• DocumentCode
    468425
  • Title
    Conceptual Clustering Categorical Data with Uncertainty
  • Author
    Xia, Yuni ; Xi, Bowei
  • Author_Institution
    Indiana Univ., Indianapolis
  • Volume
    1
  • fYear
    2007
  • fDate
    29-31 Oct. 2007
  • Firstpage
    329
  • Lastpage
    336
  • Abstract
    Many real datasets have uncertain categorical attribute values that are only approximately measured or imputed. Uncertainty in categorical data is commonplace in many applications, including biological annotation, medical diagnosis and automatic error detection. In such domains, the exact value of an attribute is often unknown, but may be estimated from a number of reasonable alternatives. Current conceptual clustering algorithms do not provide a convenient means for handling this type of uncertainty. In this paper we extend traditional conceptual clustering algorithms to explicitly handle uncertainty in data values. We propose a new total utility (TU) index for measuring the quality of the clustering, and we develop improved algorithms for efficiently clustering uncertain categorical data, based on the COBWEB conceptual clustering algorithm. Experimental results using real datasets demonstrate how these algorithms and the new TU measure can effectively improve the performance of clustering through the use of internal probabilistic information.
  • Keywords
    data handling; uncertainty handling; COBWEB conceptual clustering; categorical data clustering; data values; internal probabilistic information; real datasets; total utility index; Artificial intelligence; Clustering algorithms; Data mining; HTML; Proteins; Random number generation; Spatial databases; USA Councils; Uncertainty; Web pages
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Tools with Artificial Intelligence, 2007. ICTAI 2007. 19th IEEE International Conference on
  • Conference_Location
    Patras
  • ISSN
    1082-3409
  • Print_ISBN
    978-0-7695-3015-4
  • Type
    conf
  • DOI
    10.1109/ICTAI.2007.135
  • Filename
    4410302