Title :
A topic detection approach based on multi-level clustering
Author :
Song, Yang ; Du, Junping ; Hou, Lisha
Author_Institution :
Beijing Key Lab. of Intell. Telecommun. Software & Multimedia, Beijing Univ. of Posts & Telecommun., Beijing, China
Abstract :
Text clustering is the main approach to topic detection. The major shortcoming of current algorithms is their high computational complexity and long running time when the number of instances is very large. We introduce a new algorithm that clusters the text corpus in two steps: in the C-process, we divide the corpus into overlapping subsets using Canopy clustering; in the K-process, we apply the X-means algorithm to generate rough clusters from the canopies that share common instances. Experiments show that this text clustering technique reveals the true number of clusters in the corpus and runs faster than the Single-pass and K-means clustering algorithms.
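The two-step idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the thresholds `t1` and `t2`, the toy 2-D points, and the choice to draw canopy members only from the still-unassigned points are all assumptions made for illustration. The sketch covers only the C-process (Canopy clustering with Euclidean distance) plus a simple grouping of canopies that share an instance; the X-means refinement of the K-process is not implemented here.

```python
import math


def canopy_clustering(points, t1, t2):
    """Return canopies as (center_index, member_indices) pairs.

    t1 is the loose threshold and t2 the tight one (t1 > t2). Both are
    hypothetical parameters; the abstract does not state how they are set.
    """
    if t1 <= t2:
        raise ValueError("t1 must be greater than t2")
    remaining = list(range(len(points)))
    canopies = []
    while remaining:
        center = remaining[0]
        # Every still-unassigned point within t1 joins this canopy.
        members = [i for i in remaining
                   if math.dist(points[center], points[i]) <= t1]
        canopies.append((center, members))
        # Points within t2 are removed and cannot join a later canopy;
        # points between t2 and t1 stay available, which creates overlap.
        remaining = [i for i in remaining
                     if math.dist(points[center], points[i]) > t2]
    return canopies


def merge_overlapping(canopies):
    """Group canopies that share at least one instance (assumed stand-in
    for selecting the canopies that would be handed to X-means)."""
    groups = []
    for _, members in canopies:
        current = set(members)
        disjoint = []
        for group in groups:
            if group & current:
                current |= group
            else:
                disjoint.append(group)
        groups = disjoint + [current]
    return [sorted(group) for group in groups]


# Toy data: two well-separated blobs plus one point between thresholds.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (2.5, 0.0),
          (10.0, 10.0), (11.0, 10.0), (10.0, 11.0)]
canopies = canopy_clustering(points, t1=3.0, t2=1.5)
print(canopies)
print(merge_overlapping(canopies))
```

In this toy run, the point at (2.5, 0.0) lies between the two thresholds of the first canopy, so it belongs to that canopy and also seeds its own; the grouping step then treats the two canopies as one candidate region. A real text pipeline would replace the 2-D tuples with document feature vectors and run X-means inside each group.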
Keywords :
information retrieval; pattern clustering; text analysis; C-process; K-means clustering algorithms; K-process; X-means algorithm; canopy clustering; computing complexity; multilevel clustering; rough clusters; single-pass clustering algorithms; text clustering technique; time cost; topic detection; Algorithm design and analysis; Clustering algorithms; Euclidean distance; Feature extraction; Security; Time measurement; Vectors; K-means clustering; canopy clustering; multi-level; topic detection;
Conference_Title :
Control Conference (CCC), 2012 31st Chinese
Conference_Location :
Hefei
Print_ISBN :
978-1-4673-2581-3