Title :
Mining Association Rules from Complex and Irregular XML Documents Using XSLT and XQuery
Author :
Wang, Xinwei ; Cao, Chunjing
Author_Institution :
Dept. of Comput. Sci. & Technol., East China Normal Univ., Shanghai
Abstract :
XML is now widely used across the Internet for exchanging data. This fast-growing usage makes a large number of XML data sources available and raises the need for languages, methods, and tools to extract knowledge from collections of XML documents. To date, the well-known Apriori algorithm has been implemented using only the XQuery language to mine XML documents for association rules without any pre-processing or post-processing. However, that implementation can mine only sets of items for which a path expression can be written. The structure of real XML data can be more complex and irregular than that, which makes it difficult to identify the mining context. In this paper, we introduce XSL and XSLT, which are also W3C proposals, to preprocess the input XML documents, transforming complex and irregular XML documents into simple and regular ones that meet the needs of our mining algorithm. This preprocessing makes the algorithm much more adaptable and general.
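The preprocessing idea in the abstract can be illustrated with a minimal sketch. It is written in Python with the standard library rather than in XSLT and XQuery, so it is a stand-in for the approach and not the authors' implementation. The sample document, the tag names (order, bundle, item, product), and the helper names normalize and frequent_pairs are all hypothetical. The sketch flattens irregular orders into regular transactions, then counts only frequent 2-itemsets, which is a single counting step and not the full level-wise Apriori procedure.

```python
import xml.etree.ElementTree as ET
from itertools import combinations

# Hypothetical irregular input: items sit at different depths and appear
# either as <item>text</item> or as <product name="..."/>.
RAW = """
<orders>
  <order id="1"><item>bread</item><bundle><item>milk</item></bundle></order>
  <order id="2"><product name="bread"/><item>butter</item></order>
  <order id="3"><bundle><item>bread</item><product name="milk"/></bundle></order>
</orders>
"""

def normalize(raw_xml):
    """Flatten every <order> into a regular <transaction> with <item> children."""
    root = ET.fromstring(raw_xml)
    out = ET.Element("transactions")
    for order in root.iter("order"):
        names = set()
        for node in order.iter():  # walks the order and all of its descendants
            if node.tag == "item" and node.text:
                names.add(node.text.strip())
            elif node.tag == "product" and node.get("name"):
                names.add(node.get("name"))
        tx = ET.SubElement(out, "transaction", id=order.get("id", ""))
        for name in sorted(names):
            ET.SubElement(tx, "item").text = name
    return out

def frequent_pairs(transactions, min_support):
    """Return 2-itemsets whose support (fraction of transactions) meets min_support."""
    baskets = [{i.text for i in tx.findall("item")}
               for tx in transactions.findall("transaction")]
    counts = {}
    for basket in baskets:
        for pair in combinations(sorted(basket), 2):
            counts[pair] = counts.get(pair, 0) + 1
    n = len(baskets)
    return {pair: c / n for pair, c in counts.items() if c / n >= min_support}

if __name__ == "__main__":
    regular = normalize(RAW)
    print(ET.tostring(regular, encoding="unicode"))
    print(frequent_pairs(regular, min_support=0.5))
```

After normalization every transaction has the same shape, so a single path such as transaction/item reaches all items. That is the property the abstract says the XQuery-only implementation depends on.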
Keywords :
Internet; XML; data mining; XML documents; XSLT; XQuery; association rules; knowledge extraction; Computer science; Data preprocessing; Decision trees; Information technology; Neural networks; Relational databases; Apriori algorithm; XSL
Conference_Title :
Advanced Language Processing and Web Information Technology, 2008. ALPIT '08. International Conference on
Conference_Location :
Dalian, Liaoning
Print_ISBN :
978-0-7695-3273-8
DOI :
10.1109/ALPIT.2008.48