Clustering of multivariate time-series data

Author

Singhal, Ashish ; Seborg, Dale E.

Author_Institution

Dept. of Chemical Engineering, University of California, Santa Barbara, CA, USA

Volume

5

fYear

2002

fDate

2002

Firstpage

3931

Abstract

A new methodology for clustering multivariate time-series data is proposed. The methodology is based on calculation of the degree of similarity between multivariate time-series datasets using two similarity factors. One similarity factor is based on principal component analysis and the angles between the principal component subspaces while the other is based on the Mahalanobis distance between the datasets. The standard K-means algorithm is modified to cluster multivariate time-series datasets using similarity factors. Data from a highly nonlinear acetone-butanol fermentation example are clustered to demonstrate the effectiveness of the proposed methodology. Comparisons with existing clustering methods show several advantages of the proposed methodology.
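
The two kinds of quantities named in the abstract can be illustrated with a minimal sketch. It assumes a Krzanowski-style PCA similarity factor (the mean squared cosine of the angles between two k-dimensional principal component subspaces) and a plain Mahalanobis distance between dataset means. The abstract does not give the paper's exact definitions, how the distance is converted into a similarity factor, or how the two factors are combined, so the function names, the choice of reference covariance, and the parameter k below are all illustrative assumptions rather than the authors' implementation.

```python
import numpy as np

def pca_loadings(X, k):
    """Return the first k principal directions of X as columns (rows of X are samples)."""
    Xc = X - X.mean(axis=0)
    # The right singular vectors of the centered data are the principal directions.
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return vt[:k].T

def pca_similarity(X1, X2, k):
    """Assumed Krzanowski-style factor: mean squared cosine of the angles
    between the two k-dimensional principal component subspaces (0 to 1)."""
    L = pca_loadings(X1, k)
    M = pca_loadings(X2, k)
    return float(np.trace(L.T @ M @ M.T @ L) / k)

def mahalanobis_between_centers(X1, X2):
    """Mahalanobis distance between the dataset means, using the covariance
    of X1 as the reference (an assumption made for this sketch)."""
    diff = X2.mean(axis=0) - X1.mean(axis=0)
    cov = np.cov(X1, rowvar=False)
    # The pseudo-inverse guards against a singular covariance matrix.
    return float(np.sqrt(diff @ np.linalg.pinv(cov) @ diff))
```

In a K-means-style loop, such a similarity measure would take the place of the Euclidean distance when assigning each dataset to a cluster; the specific modification used in the paper is not described in the abstract.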

Keywords

pattern clustering; principal component analysis; probability; time series; Mahalanobis distance; datasets; degree of similarity; multivariate time-series data clustering; multivariate time-series datasets; nonlinear acetone-butanol fermentation; similarity factors; standard K-means algorithm; Chemical engineering; Clustering algorithms; Clustering methods; Data engineering; Databases; Fault detection; Fault diagnosis; Multidimensional systems; Process control

fLanguage

English

Publisher

IEEE

Conference_Titel

Proceedings of the 2002 American Control Conference

ISSN

0743-1619

Print_ISBN

0-7803-7298-0

Type

conf

DOI

10.1109/ACC.2002.1024543

Filename

1024543