• DocumentCode
    480607
  • Title
    Clustering Description Extraction Based on Statistical Machine Learning
  • Author
    Zhang, Chengzhi; Xu, Hongjiao
  • Author_Institution
    Dept. of Inf. Manage., Nanjing Univ. of Sci. & Technol., Nanjing
  • Volume
    2
  • fYear
    2008
  • fDate
    20-22 Dec. 2008
  • Firstpage
    22
  • Lastpage
    26
  • Abstract
    The clustering description problem is one of the key issues in traditional document clustering. Traditional document clustering algorithms can cluster objects, but they cannot give a concept description of the clustered results. Document clustering description is the problem of labeling the clusters produced by clustering a document collection. Such labels help users determine whether a cluster is relevant to their information needs. Labeling a clustered set of documents is therefore an important and challenging task in document clustering applications. To address the weak readability of traditional document clustering results, a method for automatically labeling document clusters based on machine learning is put forward. Experimental results show that the SVM-based method provides users with more concise and comprehensive document clustering results. It also reflects the linear trend of the clustering description problem.
  • Keywords
    document handling; learning (artificial intelligence); pattern clustering; statistical analysis; clustered document labeling; document clustering description extraction; document collection clustering; statistical machine learning; Clustering algorithms; Data mining; Frequency; Information management; Information technology; Labeling; Learning systems; Machine learning; Machine learning algorithms; Support vector machines; Clustering Description; Document Clustering; Statistical Machine Learning;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Intelligent Information Technology Application, 2008. IITA '08. Second International Symposium on
  • Conference_Location
    Shanghai
  • Print_ISBN
    978-0-7695-3497-8
  • Type
    conf
  • DOI
    10.1109/IITA.2008.114
  • Filename
    4739719