Title :
Fast frequent string mining using suffix arrays
Author :
Fischer, Johannes ; Heun, Volker ; Kramer, Stefan
Author_Institution :
Inst. für Informatik, Univ. München, Amalienstr., Germany
Abstract :
We present a method to mine strings that are frequent in one database and infrequent in another. The method uses suffix and lcp arrays, which can be computed extremely fast and space-efficiently and which also exhibit good locality behavior. Experiments with several biologically relevant data sets show that our approach outperforms existing methods in terms of time and space.
Keywords :
computational complexity; data mining; string matching; fast frequent string mining; lcp arrays; suffix arrays; Biology computing; Computational biology; Data mining; Frequency; Lattices; Sequences; Spatial databases
Conference_Title :
Data Mining, Fifth IEEE International Conference on
Print_ISBN :
0-7695-2278-5
DOI :
10.1109/ICDM.2005.62