DocumentCode :
3275439
Title :
Discovering maximal subsequence patterns in sequence database
Author :
Singhal, Leena ; Jain, Neha ; Gupta, Geeta ; Gupta, Neelima
Author_Institution :
Dept. of Comput. Sci., Univ. of Delhi, Delhi, India
fYear :
2009
fDate :
14-15 Dec. 2009
Firstpage :
1
Lastpage :
5
Abstract :
Mining sequential patterns in biological data has attracted a great deal of attention in the last couple of years. Biologists are interested in finding frequent ordered arrangements of motifs that may be responsible for similar expression of a group of genes. The size of the output space can be greatly reduced if only the maximal frequent patterns are reported. In this paper we present the maximal PrefixSpan algorithm, which reports the maximal frequent patterns in a sequence database. Experimental results on synthetic data show that the size of the output space is greatly reduced when only maximal frequent patterns are reported.
Keywords :
biology computing; data mining; biological data; maximal PrefixSpan algorithm; maximal frequent pattern; maximal subsequence pattern discovery; sequence database; sequential pattern mining; Computer science; Costs; Data mining; Databases; Proteins; Sampling methods; Testing; Maximal frequent sequences; Sequence mining; TFBS;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Methods and Models in Computer Science, 2009. ICM2CS 2009. Proceeding of International Conference on
Conference_Location :
Delhi
Print_ISBN :
978-1-4244-5051-0
Type :
conf
DOI :
10.1109/ICM2CS.2009.5397958
Filename :
5397958