DocumentCode
2533258
Title
Hypergraph Partition with Harmonic Average Top-N and PCA for Topic Detection
Author
Liu, Xinyue ; Ma, Fenglong ; Lin, Hongfei ; Shen, Hong
Author_Institution
Sch. of Comput. Sci. & Technol., Dalian Univ. of Technol., Dalian, China
fYear
2010
fDate
18-20 Dec. 2010
Firstpage
269
Lastpage
276
Abstract
An algorithm named SMHP is proposed to improve the efficiency of topic detection. In SMHP, a T-MI-TFIDF model is designed by introducing mutual information (MI) and increasing the weight of terms that appear in the title. A vector space model (VSM) is then constructed from the term weights, and its dimensionality is reduced by combining harmonic average Top-N (H-TOPN) with principal component analysis (PCA). Topics are then grouped using SMHP. Experimental results show that the proposed methods are better suited to clustering topics. SMHP with these novel approaches can effectively address the problem of relationships among multiple stories and improve the accuracy of the clustering results.
Keywords
graph theory; information retrieval; pattern clustering; principal component analysis; text analysis; H-TOPN; PCA; SMHP algorithm; T-MI-TFIDF model; VSM; clustering topics; harmonic average Top-N; hypergraph partition; multiple stories problem; mutual information; topic detection; Algorithm design and analysis; Clustering algorithms; Harmonic analysis; Mutual information; Noise; Partitioning algorithms; Principal component analysis
fLanguage
English
Publisher
IEEE
Conference_Titel
Parallel Architectures, Algorithms and Programming (PAAP), 2010 Third International Symposium on
Conference_Location
Dalian
Print_ISBN
978-1-4244-9482-8
Type
conf
DOI
10.1109/PAAP.2010.38
Filename
5715093