A Novel Approach for Mining High-Utility Sequential Patterns in Sequence Databases

Ahmed, Chowdhury Farhan;Tanbeer, Syed Khairuzzaman;Jeong, Byeong-Soo;

doi:10.4218/etrij.10.1510.0066

ETRI Journal, Volume 32, Issue 5, pp. 676-686, 2010

pISSN 1225-6463, eISSN 2233-7326

Electronics and Telecommunications Research Institute (한국전자통신연구원)


Ahmed, Chowdhury Farhan (Database Lab, Department of Computer Engineering, Kyung Hee University)
Tanbeer, Syed Khairuzzaman (Database Lab, Department of Computer Engineering, Kyung Hee University)
Jeong, Byeong-Soo (Database Lab, Department of Computer Engineering, Kyung Hee University)

Received : 2010.03.10
Accepted : 2010.08.02
Published : 2010.10.31


Abstract

Sequential pattern mining is an important research problem in data mining and knowledge discovery, with broad applications. However, existing sequential pattern mining approaches consider only binary frequency values for items in sequences and treat all distinct items as equally important. As a result, they cannot accurately represent many real-world scenarios. In this paper, we propose a novel framework for mining high-utility sequential patterns, which extracts more practically useful information from sequence databases in which items have non-binary frequency values within sequences and distinct items have different importance (significance) values. We also propose two new algorithms for mining high-utility sequential patterns: UtilityLevel, which uses level-wise candidate generation, and UtilitySpan, which uses a pattern-growth approach. Extensive performance analyses show that both algorithms are efficient and scalable for mining high-utility sequential patterns.
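To make the idea of non-binary frequencies and per-item importance concrete, the following is a minimal sketch and not the paper's UtilityLevel or UtilitySpan algorithm. The abstract does not give the formal utility definitions, so the sketch assumes that an item's utility in an itemset is its quantity times a per-item weight, and that a pattern's utility in one sequence is the maximum over all of its occurrences. All names (`pattern_utility_in_sequence`, `pattern_utility`, `weights`, `min_util`) and the sample data are hypothetical.

```python
from typing import Dict, List, Set

Itemset = Dict[str, int]    # item -> quantity (a non-binary frequency)
Sequence = List[Itemset]    # ordered list of itemsets
Pattern = List[Set[str]]    # ordered list of item sets to match

NEG_INF = float("-inf")


def pattern_utility_in_sequence(pattern: Pattern, sequence: Sequence,
                                weights: Dict[str, float]) -> float:
    """Maximum utility over all occurrences of `pattern` in `sequence`.

    Assumption: utility of an item = quantity * weight. Returns 0 when
    the pattern does not occur in the sequence.
    """
    # best[k] = best utility of matching the first k pattern elements so far
    best = [0.0] + [NEG_INF] * len(pattern)
    for itemset in sequence:
        # Walk k downward so one itemset matches at most one pattern element.
        for k in range(len(pattern), 0, -1):
            element = pattern[k - 1]
            if best[k - 1] != NEG_INF and all(i in itemset for i in element):
                gain = sum(itemset[i] * weights[i] for i in element)
                best[k] = max(best[k], best[k - 1] + gain)
    return 0.0 if best[-1] == NEG_INF else best[-1]


def pattern_utility(pattern: Pattern, database: List[Sequence],
                    weights: Dict[str, float]) -> float:
    """Total utility of `pattern`: sum of its per-sequence utilities."""
    return sum(pattern_utility_in_sequence(pattern, s, weights)
               for s in database)


# Hypothetical data: three sequences, quantities per item, weight per item.
weights = {"a": 2, "b": 5, "c": 1}
database = [
    [{"a": 2}, {"b": 1, "c": 3}, {"a": 1, "b": 2}],
    [{"b": 4}, {"a": 3}],
    [{"a": 1, "c": 2}, {"b": 3}],
]
pattern = [{"a"}, {"b"}]    # "a" followed later by "b"
min_util = 30               # hypothetical minimum utility threshold

total = pattern_utility(pattern, database, weights)
print(total, total >= min_util)
```

Under these assumptions, the pattern does not occur in the second sequence (there "b" precedes "a"), so only the first and third sequences contribute to its total utility. A frequency-only miner would count the pattern as present in two sequences and ignore both the quantities and the weights.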



