DocumentCode :
2502096
Title :
Spatial Representation for Efficient Sequence Classification
Author :
Kuksa, Pavel P. ; Pavlovic, Vladimir
Author_Institution :
Dept. of Comput. Sci., Rutgers Univ., Piscataway, NJ, USA
fYear :
2010
fDate :
23-26 Aug. 2010
Firstpage :
3320
Lastpage :
3323
Abstract :
We present a general, simple feature representation of sequences that allows efficient inexact matching, comparison, and classification of sequential data. This approach, recently introduced for biological sequence classification, exploits a novel multi-scale representation of strings. The new representation leads to very efficient algorithms for string comparison, independent of the alphabet size. We show that these algorithms generalize to a wide range of sequence classification problems in diverse domains, such as music and text sequence classification. The presented algorithms offer low computational cost and highly scalable implementations across application domains. The new method demonstrates order-of-magnitude running-time improvements over existing state-of-the-art approaches while matching or exceeding their predictive accuracy.
Keywords :
biology computing; classification; music; string matching; text analysis; alphabet size; biological sequence classification; feature representation; inexact matching; spatial representation; string comparison; text sequence classification; Algorithm design and analysis; Databases; Feature extraction; Kernel; Mel frequency cepstral coefficient; Proteins; sequence classification; spatial sample kernels; supervised and semi-supervised string kernels
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Pattern Recognition (ICPR), 2010 20th International Conference on
Conference_Location :
Istanbul
ISSN :
1051-4651
Print_ISBN :
978-1-4244-7542-1
Type :
conf
DOI :
10.1109/ICPR.2010.1159
Filename :
5597154