DocumentCode
3078584
Title
Association rule based frequent pattern mining in biological sequences
Author
Salim, Azzeddine ; Chandra, S. S. Vinod
Author_Institution
Dept. of Comput.-Sci. & Eng., Coll. of Eng., Thiruvananthapuram, India
fYear
2013
fDate
26-28 Dec. 2013
Firstpage
1
Lastpage
5
Abstract
Finding all frequent patterns present in a set of strings is computationally intensive. An exhaustive search, where every possible candidate is taken into consideration, is not practical for larger pattern widths because of its exponential computational complexity. Other approaches apply heuristics, where the algorithm tries to reduce the search space but may compromise the accuracy of the results to a certain extent. We used a modified Apriori algorithm to mine possible patterns in a very long sequence, in particular the most frequent substring pattern of a fixed length in a biological sequence. The algorithm performs well because it reduces the search space rapidly and computes with bit-wise operations instead of expensive string comparison operations. This algorithm outperforms existing pattern finding methods such as MEME in terms of execution time.
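The abstract gives no implementation details, so the following is only a minimal sketch of the general idea it describes: substrings are encoded as integers with bit-wise operations, and candidates are pruned level by level in the Apriori style. The 2-bit nucleotide encoding, the function names (count_kmers, decode, most_frequent_pattern) and the min_support parameter are illustrative assumptions, not the authors' method.

```python
from collections import Counter

# Assumed 2-bit code per nucleotide (illustrative, not taken from the paper).
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def count_kmers(seq, k, allowed_prefixes=None):
    """Count length-k substrings as integers using a rolling 2-bit encoding.

    If allowed_prefixes is given, a k-mer is counted only when its
    (k-1)-length prefix is in that set (Apriori-style pruning).
    """
    mask = (1 << (2 * k)) - 1  # keeps only the last k symbols
    counts = Counter()
    value = 0
    valid = 0  # consecutive valid nucleotides seen so far
    for ch in seq:
        code = CODE.get(ch)
        if code is None:  # e.g. 'N': restart the window
            value, valid = 0, 0
            continue
        value = ((value << 2) | code) & mask
        valid += 1
        if valid >= k:
            # value >> 2 drops the last symbol, leaving the (k-1)-prefix.
            if allowed_prefixes is None or (value >> 2) in allowed_prefixes:
                counts[value] += 1
    return counts


def decode(value, k):
    """Turn an integer-encoded k-mer back into a string."""
    return "".join("ACGT"[(value >> (2 * (k - 1 - i))) & 3] for i in range(k))


def most_frequent_pattern(seq, width, min_support=2):
    """Return (pattern, count) for the most frequent substring of the given
    width, or None if no substring reaches min_support."""
    frequent = None
    counts = Counter()
    for k in range(1, width + 1):
        counts = count_kmers(seq, k, frequent)
        frequent = {v for v, c in counts.items() if c >= min_support}
        if not frequent:
            return None
    best = max(frequent, key=lambda v: (counts[v], -v))
    return decode(best, width), counts[best]


print(most_frequent_pattern("ACGTACGTACGA", 3))
```

The pruning is lossless for patterns that meet min_support, because every occurrence of a pattern also contains an occurrence of its prefix. Comparing integers with shifts and masks is what replaces string comparison in this sketch.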
Keywords
biology computing; data mining; genomics; string matching; association rule based frequent pattern mining; biological sequences; bit-wise operations; execution time; modified apriori algorithm; most-frequent substring pattern; search space reduction; Algorithm design and analysis; Bioinformatics; Databases; Generators; Genomics; Pattern matching; Apriori; Genomic Sequences; Most Frequent Pattern;
fLanguage
English
Publisher
IEEE
Conference_Titel
Computational Intelligence and Computing Research (ICCIC), 2013 IEEE International Conference on
Conference_Location
Enathi
Print_ISBN
978-1-4799-1594-1
Type
conf
DOI
10.1109/ICCIC.2013.6724203
Filename
6724203