• DocumentCode
    2335691
  • Title
    Text clustering based on good aggregations
  • Author
    Hotho, Andreas ; Maedche, Alexander ; Staab, Steffen
  • Author_Institution
    Inst. für Angewandte Inf. und Formale Beschreibungsverfahren, Karlsruhe Univ., Germany
  • fYear
    2001
  • fDate
    2001
  • Firstpage
    607
  • Lastpage
    608
  • Abstract
    Text clustering typically involves clustering in a high dimensional space, which appears difficult with regard to virtually all practical settings. In addition, given a particular clustering result it is typically very hard to come up with a good explanation of why the text clusters have been constructed the way they are. We propose a new approach for applying background knowledge (in terms of an ontology) during preprocessing in order to improve clustering results and allow for selection between results. The results may be distinguished and explained by the corresponding selection of concepts in the ontology. Our results compare favourably with a sophisticated baseline preprocessing strategy.
  • Keywords
    data mining; data warehouses; pattern clustering; text analysis; background knowledge; good aggregations; high dimensional space; ontology; preprocessing; text clustering; Clustering algorithms; Clustering methods; Heuristic algorithms; Humans; Knowledge management; Measurement standards; Navigation; Ontologies; Web pages
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Mining, 2001. ICDM 2001, Proceedings IEEE International Conference on
  • Conference_Location
    San Jose, CA
  • Print_ISBN
    0-7695-1119-8
  • Type
    conf
  • DOI
    10.1109/ICDM.2001.989577
  • Filename
    989577