DocumentCode :
73238
Title :
Highly Comparative Feature-Based Time-Series Classification
Author :
Fulcher, Ben D. ; Jones, Nick S.
Author_Institution :
Dept. of Phys., Univ. of Oxford, Oxford, UK
Volume :
26
Issue :
12
Year :
2014
Date :
Dec. 1 2014
Firstpage :
3026
Lastpage :
3037
Abstract :
A highly comparative, feature-based approach to time-series classification is introduced that uses an extensive database of algorithms to extract thousands of interpretable features from time series. These features are derived from across the scientific time-series analysis literature, and include summaries of time series in terms of their correlation structure, distribution, entropy, stationarity, scaling properties, and fits to a range of time-series models. After computing thousands of features for each time series in a training set, those that are most informative of the class structure are selected using greedy forward feature selection with a linear classifier. The resulting feature-based classifiers automatically learn the differences between classes using a reduced number of time-series properties, and circumvent the need to calculate distances between time series. Representing time series in this way reduces dimensionality by orders of magnitude, allowing the method to perform well on very large data sets containing long time series or time series of different lengths. For many of the data sets studied, classification performance exceeded that of conventional instance-based classifiers, including one-nearest-neighbor classifiers using Euclidean distance and dynamic time warping. Most importantly, the selected features provide an understanding of the properties of the data set, insight that can guide further scientific investigation.
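The selection step described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the three-feature library `FEATURES`, the helper names, the synthetic data, and the use of a nearest-class-mean rule (a simple linear decision rule) scored by leave-one-out accuracy as a stand-in for the linear classifier. The paper draws on thousands of features; this toy version only shows the shape of the greedy forward loop and why series of different lengths pose no problem once each is reduced to a feature vector.

```python
import math
import random
from statistics import mean, pstdev


def lag1_autocorr(x):
    """Sample autocorrelation at lag 1 (0.0 for a constant series)."""
    m = mean(x)
    num = sum((a - m) * (b - m) for a, b in zip(x, x[1:]))
    den = sum((a - m) ** 2 for a in x)
    return num / den if den else 0.0


# Hypothetical, tiny feature library standing in for a large feature database.
FEATURES = {"mean": mean, "std": pstdev, "lag1_autocorr": lag1_autocorr}


def extract(series):
    """Map a series of any length to a fixed-size dict of feature values."""
    return {name: f(series) for name, f in FEATURES.items()}


def loo_accuracy(rows, labels, names):
    """Leave-one-out accuracy of a nearest-class-mean rule on the given features.

    Features are used unscaled here; a real pipeline would normalize them first.
    """
    classes = sorted(set(labels))
    correct = 0
    for i, (row, label) in enumerate(zip(rows, labels)):
        best, best_d = None, math.inf
        for c in classes:
            members = [r for j, (r, l) in enumerate(zip(rows, labels))
                       if l == c and j != i]
            d = sum((row[n] - mean(m[n] for m in members)) ** 2 for n in names)
            if d < best_d:
                best, best_d = c, d
        correct += best == label
    return correct / len(rows)


def greedy_forward_selection(rows, labels, max_features=2):
    """Add, one at a time, the feature that most improves accuracy; stop when none does."""
    selected, best_score = [], 0.0
    while len(selected) < max_features:
        candidates = [n for n in FEATURES if n not in selected]
        if not candidates:
            break
        score, name = max((loo_accuracy(rows, labels, selected + [n]), n)
                          for n in candidates)
        if score <= best_score:
            break
        selected.append(name)
        best_score = score
    return selected, best_score


# Synthetic two-class data with varying lengths: white noise vs. an AR(1) process.
rng = random.Random(0)
series, labels = [], []
for _ in range(20):
    n = rng.randint(100, 150)
    series.append([rng.gauss(0, 1) for _ in range(n)])
    labels.append("noise")
    x, ar = 0.0, []
    for _ in range(rng.randint(100, 150)):
        x = 0.95 * x + rng.gauss(0, 1)
        ar.append(x)
    series.append(ar)
    labels.append("ar1")

rows = [extract(s) for s in series]
selected, score = greedy_forward_selection(rows, labels)
print("selected features:", selected, "leave-one-out accuracy:", score)
```

The names of the selected features are what make the resulting classifier interpretable: they say which property of the series separates the classes, without any distance being computed between pairs of series.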
Keywords :
data mining; feature extraction; feature selection; greedy algorithms; learning (artificial intelligence); pattern classification; time series; Euclidean distances; dimensionality reduction; dynamic time warping; extensive database; feature-based classifiers; greedy forward feature selection; highly comparative feature-based time-series classification; instance-based classifiers; interpretable feature extraction; linear classifier; nearest neighbor classifiers; scientific time-series analysis literature; training set; very large data sets; Databases; Market research; Time measurement; Time-series analysis; classification
Language :
English
Journal_Title :
Knowledge and Data Engineering, IEEE Transactions on
Publisher :
IEEE
ISSN :
1041-4347
Type :
jour
DOI :
10.1109/TKDE.2014.2316504
Filename :
6786425