A Modified K-means Algorithm for Sequence Clustering

Author

Hsu, Jia-Lien ; Yang, Hong-Xiang

Author_Institution

Dept. of Comput. Sci. & Inf. Eng., Fu Jen Catholic Univ., Taipei, Taiwan

Volume

1

fYear

2009

fDate

12-14 Aug. 2009

Firstpage

287

Lastpage

292

Abstract

In this paper, we extend our research to construct a system that provides clustering services beyond user-active search. We use DCT mapping to extract features from sequences and discuss two notions of sequence similarity: whole similarity and partial similarity. These two concepts are applied when clustering equal-length and variable-length sequences, respectively. In the equal-length case, we map each sequence to a point in the feature space and then cluster the sequences by applying hierarchical clustering and partitional clustering (i.e., K-means). In the variable-length case, we cut each sequence into subsequences with a sliding window and map the subsequences to f-dimensional points. We propose a Modified K-means (MK) algorithm to handle partial similarity of subsequences. Finally, we implement our methods and perform experiments to show the efficiency and effectiveness of our approach.
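
The feature-extraction step described in the abstract can be illustrated with a minimal sketch. It assumes an orthonormal DCT-II and keeps the first f coefficients as the feature point; the function names, the window length w, and the step size are hypothetical choices for illustration and are not taken from the paper. The Modified K-means (MK) algorithm itself is not specified in the abstract, so it is not sketched here.

```python
import math


def dct_features(seq, f):
    """Return the first f orthonormal DCT-II coefficients of seq.

    Assumption: the paper's "DCT mapping" is taken here to be a plain DCT-II
    truncated to f coefficients; the abstract does not give the exact variant.
    """
    n = len(seq)
    if not 0 < f <= n:
        raise ValueError("f must be between 1 and len(seq)")
    feats = []
    for k in range(f):
        total = sum(x * math.cos(math.pi * (i + 0.5) * k / n)
                    for i, x in enumerate(seq))
        # Orthonormal scaling: sqrt(1/n) for k = 0, sqrt(2/n) otherwise.
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        feats.append(scale * total)
    return feats


def sliding_windows(seq, w, step=1):
    """Cut seq into overlapping subsequences of length w (hypothetical helper)."""
    return [seq[i:i + w] for i in range(0, len(seq) - w + 1, step)]


def subsequence_features(seq, w, f, step=1):
    """Map every length-w window of seq to an f-dimensional point."""
    return [dct_features(win, f) for win in sliding_windows(seq, w, step)]
```

For equal-length sequences, calling dct_features once per sequence gives one point per sequence, which any standard hierarchical or K-means implementation can then cluster. For variable-length sequences, subsequence_features gives one point per window, so a sequence is represented by a set of points rather than a single point.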

Keywords

discrete cosine transforms; pattern clustering; discrete cosine transform; feature extraction; hierarchical clustering; modified k-mean algorithm; sequence clustering; user-active search; Clustering algorithms; Computer science; Data mining; Discrete Fourier transforms; Discrete cosine transforms; Feature extraction; Hybrid intelligent systems; Indexing; Multimedia databases; Partitioning algorithms; Clustering; K-means; Sequences;

fLanguage

English

Publisher

IEEE

Conference_Titel

Hybrid Intelligent Systems, 2009. HIS '09. Ninth International Conference on

Conference_Location

Shenyang

Print_ISBN

978-0-7695-3745-0

Type

conf

DOI

10.1109/HIS.2009.64

Filename

5254352