DocumentCode :
468157
Title :
Learning Selective Averaged One-Dependence Estimators for Probability Estimation
Author :
Wang, Qing ; Zhou, Chuan-Hua ; Guo, Jian-Kui
Author_Institution :
Anhui Univ. of Technol., Anhui
Volume :
1
fYear :
2007
fDate :
24-27 Aug. 2007
Firstpage :
492
Lastpage :
496
Abstract :
Naive Bayes is a well-known, effective, and efficient classification algorithm, but its probability estimation performance is poor. Averaged one-dependence estimators (AODE) is a recently proposed semi-naive Bayes algorithm that demonstrates significantly high classification accuracy at a modest cost. In many data mining applications, however, accurate probability estimation is more desirable when making optimal decisions. Usually, probability estimation performance is measured by conditional log likelihood (CLL). In this paper, we first study the probability estimation performance of AODE and compare it to naive Bayes, tree-augmented naive Bayes, CLLTree, C4.4 (the improved version of C4.5 for better probability estimation), and Support Vector Machines. From our experiments, we find that AODE performs significantly better than all of the compared algorithms except C4.4, and performs slightly better than C4.4, although its classification accuracy is significantly better than that of C4.5. We then propose an efficient forward greedy feature selection algorithm for AODE that uses the CLL score for attribute selection. The experimental results show that our algorithm achieves substantial improvement over AODE and significantly outperforms C4.4. Our experiments are conducted on 36 UCI data sets that cover a wide range of domains and data characteristics, and we run all the algorithms within the Weka platform.
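As a rough illustration of the two ideas named in the abstract, the sketch below computes a CLL score as the sum of log-probabilities assigned to the true classes, and runs a generic forward greedy search that keeps adding the attribute that most improves a score. It is not the authors' implementation: the function names, the `score` callback, the empty starting subset, the stopping rule, and the `eps` floor are all assumptions made for illustration, and the abstract does not say how the selected attributes are used inside AODE.

```python
import math


def conditional_log_likelihood(probs, labels, eps=1e-12):
    """Sum of log P(true class | instance) over all instances.

    probs  : list of dicts mapping class label -> predicted probability
    labels : list of true class labels, aligned with probs
    eps    : floor that avoids log(0); an assumption, not from the paper
    """
    return sum(math.log(max(p.get(y, 0.0), eps)) for p, y in zip(probs, labels))


def forward_greedy_select(attributes, score):
    """Generic forward greedy search over attribute subsets.

    score is a hypothetical callback that takes a list of attributes and
    returns a number to maximise (for example, a CLL estimate of a model
    built on that subset). The search stops when no single added
    attribute improves the score.
    """
    selected = []
    best = score(selected)
    remaining = list(attributes)
    while remaining:
        # Pick the attribute whose addition gives the highest score.
        candidate = max(remaining, key=lambda a: score(selected + [a]))
        candidate_score = score(selected + [candidate])
        if candidate_score <= best:
            break
        selected.append(candidate)
        remaining.remove(candidate)
        best = candidate_score
    return selected, best


if __name__ == "__main__":
    # Toy score: attributes "x" and "y" help, anything else hurts.
    useful = {"x", "y"}
    toy_score = lambda s: len(set(s) & useful) - 0.5 * len(set(s) - useful)
    print(forward_greedy_select(["x", "y", "z"], toy_score))
```

In practice the `score` callback would train and evaluate a model on each candidate subset, which is where nearly all of the cost of such a search lies.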
Keywords :
Bayes methods; estimation theory; pattern classification; probability; support vector machines; trees (mathematics); C4.4; CLLTree; Weka platform; averaged one-dependence estimators; classification algorithm; conditional log likelihood; data mining applications; forward greedy feature selection algorithm; probability estimation; semi-naive Bayes algorithm; support vector machines; tree-augmented naive Bayes; Classification algorithms; Costs; Data mining; Engineering management; Equations; Information technology; Machine learning; Support vector machine classification; Support vector machines; Technology management;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Fuzzy Systems and Knowledge Discovery, 2007. FSKD 2007. Fourth International Conference on
Conference_Location :
Haikou
Print_ISBN :
978-0-7695-2874-8
Type :
conf
DOI :
10.1109/FSKD.2007.384
Filename :
4405974