Title :
Fast Co-clustering Using Matrix Decomposition
Author :
Ling, Yun ; Ye, Chongyi
Author_Institution :
Coll. of Comput. & Inf. Eng., Zhejiang Gongshang Univ., Hangzhou, China
Abstract :
Co-clustering is a powerful data mining technique with varied applications such as text clustering, web-log mining and microarray analysis. Simultaneously clustering the rows and columns of a large data matrix (co-clustering) is therefore an important problem. Current co-clustering techniques, such as information-theoretic and Bayesian methods, provide good accuracy but are computationally very expensive. Real data are noisy because of limitations in measurement technology and experimental variability, and this noise prevents co-clustering models from revealing the true clusters. Moreover, data matrices with a large number of rows and columns limit the applicability of these models. In this paper, we use the correspondence analysis algorithm to perform matrix decomposition and then apply a Bayesian approach for co-clustering. We find that combining the two methods is very useful for solving practical problems. Experiments on synthetic and real-world data demonstrate the efficiency and effectiveness of our algorithm.
Keywords :
data mining; matrix decomposition; pattern clustering; text analysis; Weblog mining; data matrix; microarray analysis; text clustering; Application software; Bayesian methods; Clustering algorithms; Data analysis; Educational institutions; Information processing; Partitioning algorithms; Power engineering computing; Bayesian approach; co-clustering; correspondence analysis; relationship
Conference_Title :
Asia-Pacific Conference on Information Processing, 2009 (APCIP 2009)
Conference_Location :
Shenzhen
Print_ISBN :
978-0-7695-3699-6
DOI :
10.1109/APCIP.2009.186