DocumentCode :
2771499
Title :
Finding Time Series Motifs in Disk-Resident Data
Author :
Mueen, Abdullah ; Keogh, Eamonn ; Bigdely-Shamlo, Nima
Author_Institution :
Dept. of Comput. Sci. & Eng., Univ. of California, Riverside, CA, USA
fYear :
2009
fDate :
6-9 Dec. 2009
Firstpage :
367
Lastpage :
376
Abstract :
Time series motifs are sets of very similar subsequences of a long time series. They are of interest in their own right, and are also used as inputs in several higher-level data mining algorithms including classification, clustering, rule-discovery and summarization. In spite of extensive research in recent years, finding exact time series motifs in massive databases is an open problem. Previous efforts either found approximate motifs or considered relatively small datasets residing in main memory. In this work, we describe for the first time a disk-aware algorithm to find exact time series motifs in multi-gigabyte databases which contain on the order of tens of millions of time series. We have evaluated our algorithm on datasets from diverse areas including medicine, anthropology, computer networking and image processing and show that we can find interesting and meaningful motifs in datasets that are many orders of magnitude larger than anything considered before.
Keywords :
data mining; time series; disk-aware algorithm; disk-resident data; time series motifs; Biomedical imaging; Classification algorithms; Clustering algorithms; Computer science; DNA; Data engineering; Image databases; Multidimensional systems; USA Councils; closest pair; exact algorithm; time series motif
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Data Mining, 2009. ICDM '09. Ninth IEEE International Conference on
Conference_Location :
Miami, FL
ISSN :
1550-4786
Print_ISBN :
978-1-4244-5242-2
Electronic_ISBN :
1550-4786
Type :
conf
DOI :
10.1109/ICDM.2009.15
Filename :
5360262