DocumentCode :
3134471
Title :
String join using precedence count matrix
Author :
Cao, Xia ; Tung, Anthony K H ; Ooi, Beng Chin ; Tan, Kian-Lee ; Li, Shuai Cheng
Author_Institution :
Dept. of Comput. Sci., National Univ. of Singapore, Singapore
fYear :
2004
fDate :
21-23 June 2004
Firstpage :
345
Lastpage :
348
Abstract :
In this paper, we propose a filter-and-refine string join algorithm. While the filtering phase can rapidly prune away strings that are not joinable, the refinement phase employs a comprehensive algorithm to remove the remaining false alarms. The efficiency of the proposed scheme lies in the use of the precedence count matrix (PCM) for computing the edit distance between two sequences. With PCM, sequence comparison takes constant time. We also evaluated the proposed sequence join algorithm, and our study shows that it outperforms the known techniques.
Keywords :
DNA; distributed databases; genetics; query languages; relational databases; scientific information systems; string matching; DNA sequences; constant time complexity; false alarm removal; filter-and-refine string join algorithm; genomic applications; precedence count matrix; sequence comparison; sequence edit distance computing; sequence join algorithm; string data manipulation; string pruning; string refinement; string similarity; Assembly; Bioinformatics; Computer science; Dynamic programming; Filtering algorithms; Filters; Finance; Genomics; Phase change materials
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Scientific and Statistical Database Management, 2004. Proceedings. 16th International Conference on
ISSN :
1099-3371
Print_ISBN :
0-7695-2146-0
Type :
conf
DOI :
10.1109/SSDM.2004.1311228
Filename :
1311228