Encoding of XML Elements for Mining Association Rules

  • Hu Gongzhu (Department of Computer Science, Central Michigan University)
  • Liu Yan (Central Michigan University, USA)
  • Huang Qiong (Central Michigan University, USA)
  • Published: 2005.12.01

Abstract

Association rule mining finds associations among data items that appear together in transactions or business activities. To date, algorithms for association rule mining, like those for other data mining tasks, have mostly been applied to relational databases. As XML is adopted as a universal format for data storage and exchange, mining associations from XML data has become an area of attention for researchers and developers. The challenge is that the semi-structured format of XML data is not directly suitable for traditional data mining algorithms and tools. In this paper we present a method for encoding XML tree nodes. The method stores the XML data in a Value Table and a Transaction Table that can be easily accessed via indexing. The hierarchical relationships of the original XML tree structure are embedded in the encoding. We applied this method to association rule mining of XML data that may have missing data.
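
The abstract does not spell out the encoding itself, so the following is only an illustrative sketch of the general idea, not the authors' scheme. It assumes a positional (Dewey-style) code in which each node's code is its parent's code plus its 1-based child position, so the hierarchy can be read off the code. It also assumes that each child of the root is one transaction and that empty leaves represent missing data. The function name `encode`, the table layouts, and the sample document are all hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def encode(xml_text):
    """Build a (value_table, transaction_table) pair from an XML string.

    Assumptions (not from the paper): children of the root are transactions,
    leaves are items, and a node code is the dot-joined path of 1-based
    child positions starting from the root ("1").
    """
    root = ET.fromstring(xml_text)
    value_table = {}                       # (tag, text) -> value id
    transaction_table = defaultdict(list)  # transaction code -> [(node code, value id)]

    def walk(node, code, tx_code):
        children = list(node)
        if not children:
            text = (node.text or "").strip()
            if text:  # an empty leaf is treated as missing data and skipped
                key = (node.tag, text)
                value_id = value_table.setdefault(key, len(value_table) + 1)
                transaction_table[tx_code].append((code, value_id))
            return
        for position, child in enumerate(children, start=1):
            # The parent's code is a prefix of every child's code.
            walk(child, f"{code}.{position}", tx_code)

    for position, transaction in enumerate(root, start=1):
        tx_code = f"1.{position}"
        walk(transaction, tx_code, tx_code)
    return value_table, dict(transaction_table)


SAMPLE = """
<orders>
  <order><item>milk</item><item>bread</item></order>
  <order><item>milk</item><item></item></order>
</orders>
"""

values, transactions = encode(SAMPLE)
```

In this sketch each distinct value is stored once and referenced by id from every transaction that contains it, and a prefix test on the codes (for example, `"1.2.1".startswith("1.2.")`) recovers the ancestor relationship without revisiting the XML tree.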

Keywords