DocumentCode :
2387374
Title :
A Novel Algorithm for Counting All Common Subsequences
Author :
Wang, Hui ; Lin, Zhiwei
Author_Institution :
Univ. of Ulster, Coleraine
fYear :
2007
fDate :
2-4 Nov. 2007
Firstpage :
502
Lastpage :
502
Abstract :
A similarity measure is key to any distance-based data analysis task, e.g., clustering, classification and case-based reasoning. The choice of similarity measure depends on the type of data used in a data analysis task. For sequence data the well known similarity measures include sequence edit distance and longest common subsequence. A new measure of sequence similarity, all common subsequence (ACS), is recently proposed. ACS uses the number of common subsequences between a pair of sequences as a measure of similarity. The more common subsequences there are between them, the more similar they are to each other. A dynamic programming algorithm is presented in H. Wang (2007) to compute ACS, which we call calACS-DP. In this paper we analyze the complexities of this algorithm in time and space, which are both quadratic in the length of sequence. Based on this analysis we present a novel, more efficient algorithm to compute ACS, which we call calACS. This algorithm is derived from an algorithm presented in H. Wang and C. Liu, (2006). The space complexity of this algorithm is O(min{|A|, |B|} + 1) and the time complexity of this algorithm is O(|A| times |B|) for any two sequences A and B. We further illustrate this algorithm with an example.
Keywords :
computational complexity; data analysis; dynamic programming; case-based reasoning; distance-based data analysis task; dynamic programming algorithm; sequence data; time complexity; Algorithm design and analysis; Computer networks; Computer security; Cryptography; Data analysis; Data security; Dynamic programming; Heuristic algorithms; Mathematics; Multidimensional systems;
fLanguage :
English
Publisher :
ieee
Conference_Titel :
Granular Computing, 2007. GRC 2007. IEEE International Conference on
Conference_Location :
Fremont, CA
Print_ISBN :
978-0-7695-3032-1
Type :
conf
DOI :
10.1109/GrC.2007.112
Filename :
4403150
Link To Document :
بازگشت