DocumentCode :
2819889
Title :
Improved Pattern Matching to Find DNA Patterns
Author :
Dudás, L.
Author_Institution :
Dept. of Inf. Eng., Miskolc Univ.
Volume :
2
fYear :
2006
fDate :
25-28 May 2006
Firstpage :
345
Lastpage :
349
Abstract :
The process of finding given patterns in DNA sequences is widely used in modern biological sciences. This paper shows an algorithmic improvement for exact pattern matching introducing a new heuristic. For implementation the author created an application that uses three well-known heuristics to ensure the O(n) worst case time, the O(n log sigma (m)/m) average case time and the O(n/m) best case time of searching for an m length pattern in an n length text that use a sigma letter alphabet. This application served as a testbed for the new H4 heuristic. The novelty is in optimization of the direction of text window movement in the preprocessing phase. The idea takes advantages of RAM based searching: usually all the text resides in today´s gigabyte memory so the opposite direction of searching window moving requires the same time as the usual. A new function predicts the better moving direction in preprocessing time, based on the unsymmetrical property of pattern. Tests proved that this heuristic may result in fewer jumps and tested characters in the search phase of pattern matching
Keywords :
DNA; biology computing; computational complexity; genetics; pattern matching; random-access storage; DNA patterns; RAM based searching; computational complexity; pattern matching; Biological information theory; Biology; DNA; Hydrogen; Pattern matching; Random access memory; Read-write memory; Sequences; Software algorithms; Testing;
fLanguage :
English
Publisher :
ieee
Conference_Titel :
Automation, Quality and Testing, Robotics, 2006 IEEE International Conference on
Conference_Location :
Cluj-Napoca
Print_ISBN :
1-4244-0360-X
Electronic_ISBN :
1-4244-0361-8
Type :
conf
DOI :
10.1109/AQTR.2006.254657
Filename :
4022980
Link To Document :
بازگشت