Deep Learning Framework with Convolutional Sequential Semantic Embedding for Mining High-Utility Itemsets and Top-N Recommendations

  • Siva S (Department of Computer Science and Applications, Reva University)
  • Shilpa Chaudhari (Department of Computer Science and Engineering, MS Ramaiah Institute of Technology)
  • Received: 2023.05.24
  • Reviewed: 2023.11.04
  • Published: 2024.03.31

Abstract

High-utility itemset mining (HUIM) is a key technology that supports real-time enterprise decision-making in areas such as supply chain management, customer segmentation, and business analytics. However, classical support-driven Apriori solutions are limited and cannot meet real-time enterprise demands, especially for large volumes of input data. This study introduces a model for top-N high-utility itemset mining in real-time enterprise applications. Unlike traditional Apriori-based solutions, the proposed model combines convolutional sequential embedding metrics with cosine-similarity-based multilayer perceptron learning, leveraging global and contextual features, including semantic attributes, to improve top-N recommendations over sequential transactions. MATLAB-based simulations of the model on diverse datasets yielded a precision of 0.5632, a mean absolute error (MAE) of 0.7610, a hit rate (HR)@K of 0.5720, and a normalized discounted cumulative gain (NDCG)@K of 0.4268. The average MAE across different datasets and latent dimensions was 0.608. Additionally, the model achieved cumulative accuracy and precision of 97.94% and 97.04%, respectively, surpassing existing state-of-the-art models. These results support the robustness and effectiveness of the proposed model in real-time enterprise scenarios.
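The abstract reports HR@K, NDCG@K, and MAE without defining them. The sketch below shows how these metrics, and the cosine similarity the model relies on, are commonly computed in top-N recommendation studies. It is written in Python rather than the MATLAB used in the study, and it assumes a leave-one-out setting with a single held-out relevant item per user; the paper's exact evaluation protocol may differ. All function and variable names are illustrative.

```python
import math


def hit_rate_at_k(ranked_items, relevant_item, k):
    """Return 1.0 if the held-out item appears in the top-k list, else 0.0."""
    return 1.0 if relevant_item in ranked_items[:k] else 0.0


def ndcg_at_k(ranked_items, relevant_item, k):
    """NDCG@K for one relevant item (assumed setting): the ideal DCG is 1,
    so the score is 1 / log2(rank + 1) with a 1-based rank."""
    for position, item in enumerate(ranked_items[:k]):
        if item == relevant_item:
            return 1.0 / math.log2(position + 2)
    return 0.0


def mean_absolute_error(predicted, actual):
    """Average absolute difference between predicted and observed scores."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def cosine_similarity(u, v):
    """Cosine of the angle between two embedding vectors; 0.0 if either is zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)
```

Under these assumptions, per-user HR@K and NDCG@K values would be averaged over all users to obtain dataset-level figures of the kind quoted in the abstract.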
