Title :
Generality-based conceptual clustering with probabilistic concepts
Author :
Talavera, Luis ; Béjar, Javier
Author_Institution :
Dept. de Llenguatges i Sistemes Inf., Univ. Politecnica de Catalunya, Barcelona, Spain
fDate :
2/1/2001
Abstract :
Statistical research in clustering has almost universally focused on data sets described by continuous features, and its methods are difficult to apply to tasks involving symbolic features. In addition, these methods are seldom concerned with helping the user interpret the results obtained. Machine learning researchers have developed conceptual clustering methods aimed at solving these problems. Following a long-term tradition in AI, early conceptual clustering implementations employed logic as the mechanism of concept representation. However, logical representations have been criticized for constraining the resulting cluster structures to be described by necessary and sufficient conditions. An alternative is probabilistic concepts, which associate a probability or weight with each property of the concept definition. In this paper, we propose a symbolic hierarchical clustering model that makes use of probabilistic representations and extends the traditional ideas of specificity-generality typically found in machine learning. We propose a parameterized measure that allows users to specify both the number of levels and the degree of generality of each level. By providing some feedback to the user about the balance of the generality of the concepts created at each level and given the intuitive behavior of the user parameter, the system improves user interaction in the clustering process.
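The notion of a probabilistic concept mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' model or their generality measure; it only shows one common way to summarize a cluster of symbolic instances by the conditional probability of each attribute value. The function name `probabilistic_concept` and the toy data are hypothetical and chosen for illustration.

```python
from collections import Counter, defaultdict


def probabilistic_concept(instances):
    """Summarize a cluster as P(attribute = value | cluster).

    `instances` is a list of dicts mapping attribute names to symbolic
    values. This is an illustrative assumption, not the paper's method.
    """
    counts = defaultdict(Counter)
    for instance in instances:
        for attribute, value in instance.items():
            counts[attribute][value] += 1
    n = len(instances)
    # Relative frequency of each value among the cluster's instances.
    return {
        attribute: {value: c / n for value, c in value_counts.items()}
        for attribute, value_counts in counts.items()
    }


# Hypothetical toy cluster with two symbolic attributes.
cluster = [
    {"color": "red", "shape": "round"},
    {"color": "red", "shape": "square"},
    {"color": "green", "shape": "round"},
    {"color": "red", "shape": "round"},
]
concept = probabilistic_concept(cluster)
```

Unlike a logical description, which would require every member to satisfy the same necessary and sufficient conditions, this summary tolerates exceptions: here most but not all instances are red, and the concept records that as a probability rather than excluding the green instance.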
Keywords :
pattern clustering; probability; AI; continuous features; data sets; feedback; generality-based conceptual clustering; logical representations; machine learning; necessary and sufficient conditions; probabilistic concepts; probabilistic representations; statistical research; symbolic features; symbolic hierarchical clustering model; user interaction; Artificial intelligence; Clustering methods; Data analysis; Data mining; Feedback; Logic; Machine learning; Sufficient conditions; Tree data structures; Unsupervised learning
Journal_Title :
Pattern Analysis and Machine Intelligence, IEEE Transactions on