DocumentCode
468425
Title
Conceptual Clustering Categorical Data with Uncertainty
Author
Xia, Yuni ; Xi, Bowei
Author_Institution
Indiana Univ., Indianapolis
Volume
1
fYear
2007
fDate
29-31 Oct. 2007
Firstpage
329
Lastpage
336
Abstract
Many real datasets have uncertain categorical attribute values that are only approximately measured or imputed. Such uncertainty is commonplace in applications including biological annotation, medical diagnosis, and automatic error detection. In these domains, the exact value of an attribute is often unknown but can be estimated from a number of reasonable alternatives. Current conceptual clustering algorithms do not provide a convenient means of handling this type of uncertainty. In this paper we extend traditional conceptual clustering algorithms to explicitly handle uncertainty in data values. We propose a new total utility (TU) index for measuring the quality of a clustering, and we develop improved algorithms, based on the COBWEB conceptual clustering algorithm, for efficiently clustering uncertain categorical data. Experimental results on real datasets demonstrate that these algorithms and the new TU measure can effectively improve clustering performance through the use of internal probabilistic information.
Keywords
data handling; uncertainty handling; COBWEB conceptual clustering; categorical data clustering; data values; internal probabilistic information; real datasets; total utility index; Artificial intelligence; Clustering algorithms; Data mining; HTML; Proteins; Random number generation; Spatial databases; USA Councils; Uncertainty; Web pages
fLanguage
English
Publisher
IEEE
Conference_Titel
Tools with Artificial Intelligence, 2007. ICTAI 2007. 19th IEEE International Conference on
Conference_Location
Patras
ISSN
1082-3409
Print_ISBN
978-0-7695-3015-4
Type
conf
DOI
10.1109/ICTAI.2007.135
Filename
4410302