DocumentCode
2984387
Title
Clustering Time Series Using Unsupervised-Shapelets
Author
Zakaria, Jamaluddin ; Mueen, Abdullah ; Keogh, Eamonn
Author_Institution
Dept. of Comput. Sci. & Eng., Univ. of California, Riverside, Riverside, CA, USA
fYear
2012
fDate
10-13 Dec. 2012
Firstpage
785
Lastpage
794
Abstract
Time series clustering has become an increasingly important research topic over the past decade. Most existing methods for time series clustering rely on distances calculated from the entire raw data using the Euclidean distance or Dynamic Time Warping distance as the distance measure. However, the presence of significant noise, dropouts, or extraneous data can greatly limit the accuracy of clustering in this domain. Moreover, for most real world problems, we cannot expect objects from the same class to be equal in length. As a consequence, most work on time series clustering only considers the clustering of individual time series "behaviors," e.g., individual heart beats or individual gait cycles, and contrives the time series in some way to make them all equal in length. However, contriving the data in such a way is often a harder problem than the clustering itself. In this work, we show that by using only some local patterns and deliberately ignoring the rest of the data, we can mitigate the above problems and cluster time series of different lengths, i.e., cluster one heartbeat with multiple heartbeats. To achieve this we exploit and extend a recently introduced concept in time series data mining called shapelets. Unlike existing work, our work demonstrates for the first time the unintuitive fact that shapelets can be learned from unlabeled time series. We show, with extensive empirical evaluation in diverse domains, that our method is more accurate than existing methods. Moreover, in addition to accurate clustering results, we show that our work also has the potential to give insights into the domains to which it is applied.
Keywords
pattern clustering; time series; unsupervised learning; Euclidean distance measure; clustering accuracy; dynamic time warping distance measure; time series clustering; unsupervised shapelet concept; Clustering algorithms; Data mining; Earth; Euclidean distance; Time measurement; Time series analysis; Vectors; clustering; shapelets; time series; unsupervised
fLanguage
English
Publisher
IEEE
Conference_Titel
Data Mining (ICDM), 2012 IEEE 12th International Conference on
Conference_Location
Brussels
ISSN
1550-4786
Print_ISBN
978-1-4673-4649-8
Type
conf
DOI
10.1109/ICDM.2012.26
Filename
6413851