Title :
Implementation of time series data clustering based on SVD for stock data analysis on hadoop platform
Author :
Yonghong Xie; Aziguli Wulamu; Yantao Wang; Zheng Liu
Author_Institution :
Sch. of Comput. & Commun. Eng., Univ. of Sci. & Technol. Beijing (USTB), Beijing, China
Abstract :
As data volumes grow, a viable approach is to process data in parallel on a cluster consisting of a large number of computers; the Hadoop parallel computing platform is a typical representative. Clustering analysis is one of the main methods for mining time series data. However, general clustering algorithms cannot cluster time series data directly because such data has a special structure. The time series clustering algorithm presented here combines the Canopy and K-means algorithms on the basis of SVD. Singular value decomposition is first used to extract features from the time series data, and the Canopy and K-means algorithms are then used to cluster the extracted features. Finally, the algorithm is implemented on the Hadoop platform with Mahout, yielding a new clustering method that can handle massive time series data. This new clustering analysis method is successfully applied to real stock time series data with a satisfactory result.
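The pipeline described in the abstract (SVD feature extraction, Canopy to seed the clusters, then K-means) can be sketched on a single machine as follows. This is a minimal illustration under stated assumptions, not the paper's Mahout/Hadoop implementation: the function names, the threshold `t2`, and the tiny synthetic series are hypothetical, power iteration stands in for a library SVD routine, and the looser Canopy threshold T1 is omitted because only the canopy centers are needed to seed K-means.

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def top_right_singular_vectors(rows, k, iters=200, seed=0):
    """Approximate the top-k right singular vectors by power iteration with deflation."""
    rng = random.Random(seed)
    n = len(rows[0])
    vectors = []
    for _ in range(k):
        v = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        for _ in range(iters):
            # Deflation: remove components along vectors already found.
            for u in vectors:
                proj = dot(v, u)
                v = [a - proj * b for a, b in zip(v, u)]
            # One power step v <- A^T (A v), followed by normalization.
            av = [dot(row, v) for row in rows]
            v = [sum(row[j] * av[i] for i, row in enumerate(rows)) for j in range(n)]
            norm = math.sqrt(dot(v, v)) or 1.0
            v = [a / norm for a in v]
        vectors.append(v)
    return vectors

def svd_features(rows, k):
    """Project each series onto the top-k right singular vectors."""
    basis = top_right_singular_vectors(rows, k)
    return [[dot(row, v) for v in basis] for row in rows]

def canopy_centers(points, t2):
    """Each remaining point becomes a center and removes all points within t2 of it."""
    remaining = list(points)
    centers = []
    while remaining:
        center = remaining.pop(0)
        centers.append(center)
        remaining = [p for p in remaining if dist(p, center) > t2]
    return centers

def kmeans(points, initial_centers, iters=20):
    """Lloyd's K-means seeded with the given centers."""
    centers = [list(c) for c in initial_centers]
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(len(centers)), key=lambda c: dist(p, centers[c]))
                  for p in points]
        for c in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers

# Synthetic example (not real stock data): two rising and two falling series.
series = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [1.1, 2.1, 2.9, 4.2, 5.1],
    [5.0, 4.0, 3.0, 2.0, 1.0],
    [5.2, 3.9, 3.1, 2.0, 0.9],
]
features = svd_features(series, k=2)
seeds = canopy_centers(features, t2=2.0)   # t2 is an assumed threshold
labels, centers = kmeans(features, seeds)
print(labels)
```

Seeding K-means with the canopy centers means the number of clusters does not have to be fixed in advance; it follows from the distance threshold instead.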
Keywords :
data mining; feature extraction; parallel processing; pattern clustering; singular value decomposition; time series; Canopy algorithm; Hadoop parallel computing platform; K-means algorithm; SVD; feature data clustering analysis; hadoop platform; massive time series data; stock data analysis; time series data clustering; time series data mining; Algorithm design and analysis; Clustering algorithms; Clustering methods; Matrix decomposition; Time series analysis; Vectors; clustering analysis; hadoop; k-means; mahout; stock data; time series data
Conference_Title :
Industrial Electronics and Applications (ICIEA), 2014 IEEE 9th Conference on
Conference_Location :
Hangzhou
Print_ISBN :
978-1-4799-4316-6
DOI :
10.1109/ICIEA.2014.6931498