• DocumentCode
    2143150
  • Title
    A text clustering algorithm based on category resolve power
  • Author
    Zhou, Faguo ; Zhang, Fan ; Yang, Bingru
  • Author_Institution
    School of Mechanical Electronic & Information Engineering, China University of Mining & Technology, Beijing, China
  • fYear
    2010
  • fDate
    4-6 Dec. 2010
  • Firstpage
    3494
  • Lastpage
    3497
  • Abstract
    As an unsupervised machine learning technique, document clustering has been widely used in many fields, such as Information Retrieval (IR) and Text Categorization (TC). However, because document clustering uses the bag-of-words model to index documents, the feature space of a corpus is necessarily high-dimensional. This high dimensionality hurts both the efficiency and the precision of text clustering. A new feature selection function is constructed based on category resolve power. By integrating this function with a document clustering algorithm, a high-performance text clustering algorithm is presented. Experiments on a universal corpus show that it performs well.
  • Keywords
    Accuracy; Algorithm design and analysis; Classification algorithms; Clustering algorithms; Machine learning; Partitioning algorithms; Text categorization; dimension reduction; feature selection; result evaluation; text clustering;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Information Science and Engineering (ICISE), 2010 2nd International Conference on
  • Conference_Location
    Hangzhou, China
  • Print_ISBN
    978-1-4244-7616-9
  • Type
    conf
  • DOI
    10.1109/ICISE.2010.5690994
  • Filename
    5690994