DocumentCode :
2533258
Title :
Hypergraph Partition with Harmonic Average Top-N and PCA for Topic Detection
Author :
Liu, Xinyue ; Ma, Fenglong ; Lin, Hongfei ; Shen, Hong
Author_Institution :
Sch. of Comput. Sci. & Technol., Dalian Univ. of Technol., Dalian, China
fYear :
2010
fDate :
18-20 Dec. 2010
Firstpage :
269
Lastpage :
276
Abstract :
An algorithm named SMHP is proposed to improve the efficiency of topic detection. In SMHP, a T-MI-TFIDF model is designed by introducing mutual information (MI) and increasing the weight of terms that appear in the title. A vector space model (VSM) is then constructed from the term weights, and its dimensionality is reduced by combining harmonic average Top-N (H-TOPN) with principal component analysis (PCA). Topics are then grouped using SMHP. Experimental results show that the proposed methods are more suitable for clustering topics. With these approaches, SMHP effectively handles the problem of relationships among multiple stories and improves the accuracy of the clustering results.
Keywords :
graph theory; information retrieval; pattern clustering; principal component analysis; text analysis; H-TOPN; PCA; SMHP algorithm; T-MI-TFIDF model; VSM; clustering topics; harmonic average Top-N; hypergraph partition; multiple stories problem; mutual information; topic detection; Algorithm design and analysis; Clustering algorithms; Harmonic analysis; Mutual information; Noise; Partitioning algorithms; Principal component analysis;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Parallel Architectures, Algorithms and Programming (PAAP), 2010 Third International Symposium on
Conference_Location :
Dalian
Print_ISBN :
978-1-4244-9482-8
Type :
conf
DOI :
10.1109/PAAP.2010.38
Filename :
5715093