• DocumentCode
    560327
  • Title

    Text Clustering using Frequent Contextual Termset

  • Author

    Akhriza, Tubagus Mohammad ; Ma, Yinghua ; Li, Jianhua

  • Author_Institution
    Sch. of Commun. & Inf. Syst., Shanghai Jiao Tong Univ., Shanghai, China
  • Volume
    1
  • fYear
    2011
  • fDate
    26-27 Nov. 2011
  • Firstpage
    339
  • Lastpage
    342
  • Abstract
    We introduce the frequent contextual term set (FCT), an alternative concept of term set construction for text clustering that is produced from the interestingness of documents. Compared with state-of-the-art term sets, the proposed approach has three advantages: (1) it is more efficient in term set production, (2) it is more effective in storing the vocabulary shared among documents, which expresses the context among those documents, and (3) it is more suitable for discovering the specificity of a dataset. To utilize FCT, we also introduce frequent contextual term set based hierarchical clustering (FCTHC), which adopts the concept of centroids in K-means with some main differences. The experiment shows that FCT is an appropriate pattern for text clustering and that FCTHC provides a flexible approach to cluster construction.
  • Keywords
    pattern clustering; text analysis; vocabulary; centroid concept; dataset specificity discovery; document interestingness; frequent contextual term set based hierarchical clustering; k-means; term set construction; term set production; text clustering; vocabulary storage; Clustering algorithms; Context; Data mining; Equations; Itemsets; Merging; Production; Frequent Contextual Termset; Frequent Itemset; Text clustering
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Information Management, Innovation Management and Industrial Engineering (ICIII), 2011 International Conference on
  • Conference_Location
    Shenzhen
  • Print_ISBN
    978-1-61284-450-3
  • Type

    conf

  • DOI
    10.1109/ICIII.2011.86
  • Filename
    6115455