DocumentCode :
2849994
Title :
Generation of attribute value taxonomies from data for data-driven construction of accurate and compact classifiers
Author :
Kang, Dae-Ki ; Silvescu, Adrian ; Zhang, Jun ; Honavar, Vasant
Author_Institution :
Dept. of Comput. Sci., Iowa State Univ., Ames, IA, USA
fYear :
2004
fDate :
1-4 Nov. 2004
Firstpage :
130
Lastpage :
137
Abstract :
Attribute value taxonomies (AVT) have been shown to be useful in constructing compact, robust, and comprehensible classifiers. However, in many application domains, human-designed AVTs are unavailable. We introduce AVT-learner, an algorithm for automated construction of attribute value taxonomies from data. AVT-learner uses hierarchical agglomerative clustering (HAC) to cluster attribute values based on the distribution of classes that co-occur with the values. We describe experiments on UCI data sets that compare the performance of AVT-NBL (an AVT-guided naive Bayes learner) with that of the standard naive Bayes learner (NBL) applied to the original data set. Our results show that the AVTs generated by AVT-learner are competitive with human-generated AVTs (in cases where such AVTs are available). AVT-NBL using AVTs generated by AVT-learner achieves classification accuracies that are comparable to or higher than those obtained by NBL, and the resulting classifiers are significantly more compact than those generated by NBL.
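The clustering step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not name the dissimilarity measure, so a count-weighted Jensen-Shannon divergence between class distributions is assumed here. The function names `js_divergence` and `learn_avt` and the nested-tuple tree representation are hypothetical choices made for illustration.

```python
import math
from collections import Counter, defaultdict


def js_divergence(p, q, wp, wq):
    """Weighted Jensen-Shannon divergence (in bits) between distributions p and q.

    wp and wq are the sample counts behind p and q (assumed weighting scheme).
    """
    a, b = wp / (wp + wq), wq / (wp + wq)
    m = [a * pi + b * qi for pi, qi in zip(p, q)]

    def kl(x, y):
        # Terms with x_i == 0 contribute nothing to the KL sum.
        return sum(xi * math.log2(xi / yi) for xi, yi in zip(x, y) if xi > 0)

    return a * kl(p, m) + b * kl(q, m)


def learn_avt(pairs):
    """Build a binary taxonomy over the values of one attribute.

    pairs: iterable of (attribute_value, class_label) observations.
    Returns a nested tuple whose leaves are the original attribute values.
    """
    pairs = list(pairs)
    classes = sorted({c for _, c in pairs})
    counts = defaultdict(Counter)
    for value, label in pairs:
        counts[value][label] += 1

    # Each cluster is (subtree, class-count vector); start with one per value.
    clusters = [(v, [counts[v][c] for c in classes]) for v in sorted(counts)]

    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                ci, cj = clusters[i][1], clusters[j][1]
                ni, nj = sum(ci), sum(cj)
                d = js_divergence([x / ni for x in ci],
                                  [x / nj for x in cj], ni, nj)
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        # Merge the closest pair: join subtrees and add their class counts.
        merged = ((clusters[i][0], clusters[j][0]),
                  [x + y for x, y in zip(clusters[i][1], clusters[j][1])])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)

    return clusters[0][0]
```

Values whose co-occurring class distributions are similar are merged first, so they end up as siblings low in the tree, while values with dissimilar class distributions are joined only near the root. A learner such as AVT-NBL could then choose a cut through this tree instead of treating every original value separately.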
Keywords :
Bayes methods; data mining; learning (artificial intelligence); pattern classification; AVT-NBL; AVT-learner; UCI data sets; attribute value taxonomies; automated construction; cluster attribute values; compact classifiers; data-driven construction; hierarchical agglomerative clustering; human-designed AVT; naive Bayes learner; Application software; Artificial intelligence; Clustering algorithms; Computer science; Data mining; Instruction sets; Laboratories; Ontologies; Robustness; Taxonomy;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Data Mining, 2004. ICDM '04. Fourth IEEE International Conference on
Print_ISBN :
0-7695-2142-8
Type :
conf
DOI :
10.1109/ICDM.2004.10115
Filename :
1410276