DocumentCode
390401
Title
Learning Bayesian network classifiers from data with missing values
Author
Zhang, Hongwei ; Lu, Yuchang
Author_Institution
Dept. of Comput. Sci. & Technol., Tsinghua Univ., Beijing, China
Volume
1
fYear
2002
fDate
28-31 Oct. 2002
Firstpage
35
Abstract
Learning accurate Bayesian network (BN) classifiers from complete databases is a very active research topic in data mining and machine learning. In practice, however, databases are rarely complete, which limits the use of these classifiers in real-world data mining applications. This paper investigates methods for learning four well-known types of Bayesian network classifiers from incomplete databases: Naive-Bayes, tree augmented Naive-Bayes, BN augmented Naive-Bayes, and general BN. The latter two are learned using dependency-analysis-based algorithms that work only under the assumption that the database is complete. To enable such algorithms to handle missing data, this paper introduces a novel deterministic method for estimating the (conditional) mutual information from incomplete databases, which can be used to perform conditional independence (CI) tests, a fundamental step in dependency-analysis-based algorithms. The experimental results show that our algorithm is efficient and reliable.
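As a rough illustration of the quantity the abstract refers to, the sketch below estimates the conditional mutual information I(X;Y|Z) from records that may contain missing values. It is not the paper's deterministic method, which the abstract does not describe. It assumes a simple available-case strategy (rows missing any of the variables involved are skipped), and the function name, the dict-based row format, and the use of `None` for a missing value are all hypothetical choices made for this example.

```python
from collections import Counter
from math import log


def conditional_mutual_information(rows, x, y, z=()):
    """Estimate I(X;Y|Z) in nats from a list of dict rows.

    Assumption (not from the paper): a missing value is encoded as None,
    and rows missing any of the variables x, y, or z are simply skipped.
    """
    cols = (x, y, *z)
    complete = [r for r in rows if all(r.get(c) is not None for c in cols)]
    n = len(complete)
    if n == 0:
        return 0.0  # no usable rows: report no measurable dependence

    # Joint and marginal counts over the usable rows.
    n_xyz, n_xz, n_yz, n_z = Counter(), Counter(), Counter(), Counter()
    for r in complete:
        zv = tuple(r[c] for c in z)
        n_xyz[(r[x], r[y], zv)] += 1
        n_xz[(r[x], zv)] += 1
        n_yz[(r[y], zv)] += 1
        n_z[zv] += 1

    # I(X;Y|Z) = sum p(x,y,z) * log( p(x,y,z) p(z) / (p(x,z) p(y,z)) )
    return sum(
        (c / n) * log(c * n_z[zv] / (n_xz[(xv, zv)] * n_yz[(yv, zv)]))
        for (xv, yv, zv), c in n_xyz.items()
    )
```

A CI test in a dependency-analysis algorithm would typically compare this estimate against a small threshold and treat X and Y as conditionally independent given Z when the value falls below it; the threshold itself is a tuning choice not specified in the abstract.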
Keywords
belief networks; data mining; database management systems; BN augmented Naive-Bayes method; Bayesian network classifiers learning; Naive-Bayes method; complete databases; data mining; machine learning; tree augmented Naive-Bayes method; Algorithm design and analysis; Bayesian methods; Classification tree analysis; Data mining; Intelligent systems; Iterative algorithms; Laboratories; Machine learning; Spatial databases; Testing;
fLanguage
English
Publisher
IEEE
Conference_Titel
TENCON '02. Proceedings. 2002 IEEE Region 10 Conference on Computers, Communications, Control and Power Engineering
Print_ISBN
0-7803-7490-8
Type
conf
DOI
10.1109/TENCON.2002.1180966
Filename
1180966