Title :
A Bioinformatics-Inspired Adaptation to Ukkonen's Edit Distance Calculating Algorithm and Its Applicability towards Distributed Data Mining
Author_Institution :
Dept. of Electr. Eng. & Comput. Sci., Univ. of Tennessee, Knoxville, TN
Abstract :
Edit distance measures the similarity between two strings as the minimum number of change, insert, or delete operations that transform one string into the other. An edit sequence s is a sequence of such operations and can be used to represent the string that results from applying s to a reference string. We present a modification to Ukkonen's edit distance calculating algorithm based upon representing strings by edit sequences. We conclude with a demonstration of how using this representation can improve mitochondrial DNA query throughput performance in a distributed computing environment.
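The two definitions in the abstract can be illustrated with a minimal sketch. It is not the paper's method: `edit_distance` uses the textbook dynamic program rather than Ukkonen's algorithm or the authors' modification, and the `(operation, position, character)` tuple encoding used by `apply_edits` is a hypothetical representation chosen only for this example.

```python
def edit_distance(a: str, b: str) -> int:
    """Unit-cost edit distance via the standard dynamic program.

    Assumption: this is the textbook row-by-row method, not Ukkonen's
    algorithm and not the modification described in the abstract.
    """
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # change ca to cb, or match
        prev = curr
    return prev[-1]


def apply_edits(reference: str, edits) -> str:
    """Rebuild a string from a reference string and an edit sequence.

    Assumption: each edit is a hypothetical (op, position, char) tuple,
    positions index the original reference, and no two edits share a
    position.
    """
    chars = list(reference)
    # Apply from right to left so earlier positions stay valid.
    for op, pos, ch in sorted(edits, key=lambda e: e[1], reverse=True):
        if op == "change":
            chars[pos] = ch
        elif op == "insert":
            chars.insert(pos, ch)
        elif op == "delete":
            del chars[pos]
        else:
            raise ValueError(f"unknown operation: {op}")
    return "".join(chars)


if __name__ == "__main__":
    reference = "kitten"
    edits = [("change", 0, "s"), ("change", 4, "i"), ("insert", 6, "g")]
    target = apply_edits(reference, edits)
    print(target, edit_distance(reference, target))
```

In this sketch a string close to the reference is described by a short list of edits instead of its full text, which is the kind of representation the abstract refers to.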
Keywords :
bioinformatics; data mining; distributed algorithms; sequences; string matching; Ukkonen edit distance calculating algorithm; distributed data mining; edit sequence; string representation; Bonding; Computer science; DNA; Distributed computing; Production; Software engineering; Throughput; algorithms; network throughput
Conference_Title :
Computer Science and Software Engineering, 2008 International Conference on
Conference_Location :
Wuhan, Hubei
Print_ISBN :
978-0-7695-3336-0
DOI :
10.1109/CSSE.2008.1014