DocumentCode
1722449
Title
A new method for finding approximate repetitions in DNA sequences
Author
Jiang, Yajun ; Yang, Zhenlun ; Zhan, Zengrong
Author_Institution
Sch. of Inf. Eng., Guangzhou Panyu Polytech., Guangzhou, China
Volume
2
fYear
2010
Abstract
Searching for approximate repetitions in a DNA sequence is an important topic in gene analysis. One problem is that, because pattern lengths vary, the similarity between patterns cannot be judged accurately using edit distance (ED) alone. In this paper we define a new similarity function that considers both the differences and the commonalities between patterns. Given the computational complexity, we also propose new filter methods based on frequency vectors, which efficiently select the candidate set of approximate repetitions. We use SUA instead of a sliding window to obtain the fragments of a DNA sequence, so the patterns of an approximate repetition are not limited in length. The results show that this technique finds more approximate repetitions than the tandem repeat finder.
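The abstract does not give the paper's similarity function or its filters, so the following is only a minimal sketch of the general idea behind a frequency-vector filter, under stated assumptions: the frequency distance between two fragments (a standard lower bound on edit distance) is used to discard pairs before the more expensive edit-distance computation. The names `ALPHABET`, `frequency_vector`, `frequency_distance`, and `passes_filter` are hypothetical and not taken from the paper.

```python
from collections import Counter

ALPHABET = "ACGT"  # assumption: plain DNA alphabet, other symbols are ignored


def edit_distance(a: str, b: str) -> int:
    """Classic Levenshtein distance, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def frequency_vector(s: str) -> tuple:
    """Count of each base in the fragment, in ALPHABET order."""
    counts = Counter(s)
    return tuple(counts[c] for c in ALPHABET)


def frequency_distance(u: tuple, v: tuple) -> int:
    """Lower bound on edit distance: one edit changes the total surplus
    and the total deficit by at most 1 each."""
    surplus = sum(x - y for x, y in zip(u, v) if x > y)
    deficit = sum(y - x for x, y in zip(u, v) if y > x)
    return max(surplus, deficit)


def passes_filter(a: str, b: str, max_edits: int) -> bool:
    """Cheap pre-check: reject pairs that cannot be within max_edits."""
    return frequency_distance(frequency_vector(a), frequency_vector(b)) <= max_edits


if __name__ == "__main__":
    for a, b in [("ACGT", "ACGA"), ("AAAA", "TTTT"), ("ACGTACGT", "ACGTTCGT")]:
        fd = frequency_distance(frequency_vector(a), frequency_vector(b))
        print(a, b, "frequency distance:", fd, "edit distance:", edit_distance(a, b))
```

Because the frequency distance never exceeds the edit distance, a pair rejected by `passes_filter` cannot be within `max_edits` edits, so the filter produces no false dismissals; pairs that pass still need the full comparison. The sketch also hints at the length problem the abstract raises: a raw edit distance of 1 means something different for 4-base fragments than for 8-base ones, which is why a length-aware similarity measure is needed.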
Keywords
DNA; biology computing; computational complexity; data analysis; genetics; sequences; DNA sequences; approximate repetitions; filter methods; frequency vector; gene analysis; Algorithm design and analysis; Approximation algorithms; Approximation methods; Correlation; Indexes; Signal processing algorithms; SUA; similarity
fLanguage
English
Publisher
IEEE
Conference_Titel
Signal Processing Systems (ICSPS), 2010 2nd International Conference on
Conference_Location
Dalian
Print_ISBN
978-1-4244-6892-8
Electronic_ISBN
978-1-4244-6893-5
Type
conf
DOI
10.1109/ICSPS.2010.5555809
Filename
5555809