• DocumentCode
    2533258
  • Title
    Hypergraph Partition with Harmonic Average Top-N and PCA for Topic Detection
  • Author
    Liu, Xinyue ; Ma, Fenglong ; Lin, Hongfei ; Shen, Hong
  • Author_Institution
    Sch. of Comput. Sci. & Technol., Dalian Univ. of Technol., Dalian, China
  • fYear
    2010
  • fDate
    18-20 Dec. 2010
  • Firstpage
    269
  • Lastpage
    276
  • Abstract
    An algorithm named SMHP is proposed to improve the efficiency of topic detection. In SMHP, a T-MI-TFIDF model is designed by introducing mutual information (MI) and increasing the weight of terms that appear in the title. A vector space model (VSM) is then constructed from the term weights, and its dimensionality is reduced by combining harmonic average Top-N (H-TOPN) with PCA. Topics are then grouped using SMHP. Experimental results show that the proposed methods are better suited to clustering topics: SMHP with these approaches can effectively handle the multiple-stories relationship problem and improve the accuracy of the clustering results.
  • Keywords
    graph theory; information retrieval; pattern clustering; principal component analysis; text analysis; H-TOPN; PCA; SMHP algorithm; T-MI-TFIDF model; VSM; clustering topics; harmonic average Top-N; hypergraph partition; multiple stories problem; mutual information; topic detection; Algorithm design and analysis; Clustering algorithms; Harmonic analysis; Mutual information; Noise; Partitioning algorithms; Principal component analysis
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Parallel Architectures, Algorithms and Programming (PAAP), 2010 Third International Symposium on
  • Conference_Location
    Dalian
  • Print_ISBN
    978-1-4244-9482-8
  • Type
    conf
  • DOI
    10.1109/PAAP.2010.38
  • Filename
    5715093