DocumentCode :
2866538
Title :
On the stationarity of multivariate time series for correlation-based data analysis
Author :
Yang, Kiyoung ; Shahabi, Cyrus
Author_Institution :
Comput. Sci. Dept., Southern California Univ., Los Angeles, CA, USA
fYear :
2005
fDate :
27-30 Nov. 2005
Abstract :
Multivariate time series (MTS) data sets are common in various multimedia, medical and financial application domains. These applications perform several data-analysis operations on a large number of MTS data sets, such as similarity search, feature subset selection, clustering and classification. Correlation-based techniques, such as principal component analysis (PCA), have been shown to improve the efficiency of many of the above-mentioned data-analysis operations on MTS, which implies that the correlation coefficients concisely represent the original MTS data. However, if the statistical properties (e.g., variance) of MTS data change over time, i.e., the MTS data is non-stationary, the correlation coefficients are not stable. In this paper, we propose to utilize the stationarity of the MTS data sets in order to represent the original MTS data more stably, as well as concisely, with the correlation coefficients. That is, before performing any correlation-based data analysis, we first execute a stationarity test to decide whether the MTS data is stationary or not, i.e., whether the correlation is stable or not. Subsequently, for a non-stationary MTS data set, we difference it to render the data set stationary. Even though our approach is general, to focus the discussion we describe it within the context of our previously proposed technique for MTS similarity search. In order to show the validity of our approach, we performed several experiments on four real-world data sets. The results show that the performance of our similarity search technique has significantly improved in terms of precision/recall.
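Editorial illustration (not part of the published abstract): the differencing step and the correlation coefficients mentioned above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names (`difference`, `pearson`, `correlation_matrix`, `difference_mts`) and the random-walk data are hypothetical, an MTS is assumed to be stored as a list of equal-length variables, and the stationarity test itself is omitted because the abstract does not name which test is used.

```python
import math
import random


def difference(series, lag=1):
    """Difference one variable: x[t] - x[t - lag] for t >= lag."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def difference_mts(mts, lag=1):
    """Difference every variable of an MTS (assumed: list of variables)."""
    return [difference(variable, lag) for variable in mts]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(mts):
    """Pairwise correlation coefficients between the variables of an MTS."""
    return [[pearson(a, b) for b in mts] for a in mts]


def random_walk(steps):
    """Cumulative sum of steps: a textbook non-stationary series."""
    walk, level = [], 0.0
    for step in steps:
        level += step
        walk.append(level)
    return walk


# Hypothetical data: two variables, each a random walk of 200 observations.
rng = random.Random(0)
steps = [[rng.gauss(0.0, 1.0) for _ in range(200)] for _ in range(2)]
mts = [random_walk(s) for s in steps]

# Differencing recovers the underlying (stationary) steps; correlation-based
# analysis would then operate on this matrix instead of the raw one.
stationary_mts = difference_mts(mts)
corr = correlation_matrix(stationary_mts)
```

In this sketch, differencing shortens each variable by `lag` observations, so any downstream step has to work with the shortened series.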
Keywords :
data analysis; search problems; time series; correlation-based data analysis; multivariate time series; similarity search; stationarity test; Application software; Computer science; Data analysis; Data mining; Euclidean distance; Performance evaluation; Principal component analysis; Testing; Time measurement; Time series analysis;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Data Mining, Fifth IEEE International Conference on
ISSN :
1550-4786
Print_ISBN :
0-7695-2278-5
Type :
conf
DOI :
10.1109/ICDM.2005.109
Filename :
1565787