Title :
Novel vertical mining on Diffsets structure
Author :
Consue, Wootipong ; Kurutach, Werasak
Author_Institution :
Dept. of Comput. Eng., Mahanakorn Univ. of Technol., Bangkok, Thailand
Abstract :
Mining frequent patterns on vertical data structures usually performs better than mining on the classical horizontal structure, because the vertical structure supports fast frequency counting via intersection operations on transaction identifiers (TIDs). Diffsets, a vertical data representation introduced by M.J. Zaki and K. Gouda (2001), was proposed to reduce the memory required to store intermediate TIDs during the mining process. In this paper, we present a new vertical mining algorithm on the Diffset structure called Fast Diffsets Vertical Mining (FDVM). FDVM applies the concept of pattern growth to the Diffset structure, and we show that it outperforms previous methods in mining the complete set of frequent patterns. Our experimental results indicate that significant performance improvements over previously proposed vertical and horizontal mining algorithms can be gained, especially for large databases.
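To make the two vertical representations named in the abstract concrete, the following is a minimal Python sketch of support counting with TID sets (by intersection) and with diffsets (by set difference). It illustrates only the general tidset/diffset idea and is not the FDVM algorithm, whose details are not given in this record. The toy database and the names `transactions`, `to_tidsets`, `tidsets`, and `diffsets` are assumptions made up for illustration.

```python
from typing import Dict, Set

# Hypothetical toy database in horizontal layout: TID -> items.
transactions: Dict[int, Set[str]] = {
    1: {"a", "b", "c"},
    2: {"a", "b"},
    3: {"a", "c"},
    4: {"b", "c"},
    5: {"a", "b", "c"},
}


def to_tidsets(db: Dict[int, Set[str]]) -> Dict[str, Set[int]]:
    """Convert to vertical layout: item -> set of TIDs containing it."""
    tidsets: Dict[str, Set[int]] = {}
    for tid, items in db.items():
        for item in items:
            tidsets.setdefault(item, set()).add(tid)
    return tidsets


tidsets = to_tidsets(transactions)
all_tids = set(transactions)

# Tidset approach: support of {a, b} is the size of the intersection.
support_ab_tidset = len(tidsets["a"] & tidsets["b"])

# Diffset approach: store, for each item, the TIDs that do NOT contain it.
diffsets = {item: all_tids - tids for item, tids in tidsets.items()}
support_a = len(all_tids) - len(diffsets["a"])

# Diffset of {a, b} relative to {a}: TIDs containing a but not b.
# It can be derived from the item diffsets alone: d(ab) = d(b) - d(a).
diffset_ab = diffsets["b"] - diffsets["a"]
support_ab_diffset = support_a - len(diffset_ab)

print(support_ab_tidset, support_ab_diffset, diffset_ab)
```

On this toy data the intermediate result kept for {a, b} is a three-element tidset versus a one-element diffset, which is the kind of memory saving the abstract attributes to the Diffset representation; the saving in practice depends on how dense the data is.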
Keywords :
data mining; data structures; database theory; Diffsets structure; Fast Diffsets Vertical Mining algorithm; classical horizontal structure; fast frequency counting; horizontal mining algorithm; intersection operation; memory; pattern growth; performance improvement; transaction identifier; vertical data representation; vertical data structure; Association rules; Data engineering; Electronic mail; Frequency; Information technology; Itemsets; Multidimensional systems; Transaction databases
Conference_Title :
Intelligent Agent Technology, 2003. IAT 2003. IEEE/WIC International Conference on
Print_ISBN :
0-7695-1931-8
DOI :
10.1109/IAT.2003.1241095