Regional Information Center for Science and Technology - Bayesian model averaging of Bayesian network classifiers over multiple node-orders: application to sparse datasets

DocumentCode :

1240262

Title :

Bayesian model averaging of Bayesian network classifiers over multiple node-orders: application to sparse datasets

Author :

Hwang, Kyu-Baek ; Zhang, Byoung-Tak

Author_Institution :

Sch. of Comput. Sci. & Eng., Seoul Nat. Univ., South Korea

Volume :

Issue :

Year :

2005

Firstpage :

1302

Lastpage :

1310

Abstract :

Bayesian model averaging (BMA) can resolve the overfitting problem by explicitly incorporating the model uncertainty into the analysis procedure. Hence, it can be used to improve the generalization performance of Bayesian network classifiers. Until now, BMA of Bayesian network classifiers has only been performed in some restricted forms, e.g., the model is averaged given a single node-order, because of its heavy computational burden. However, it can be hard to obtain a good node-order when the available training dataset is sparse. To alleviate this problem, we propose BMA of Bayesian network classifiers over several distinct node-orders obtained using the Markov chain Monte Carlo sampling technique. The proposed method was examined using two synthetic problems and four real-life datasets. First, we show that the proposed method is especially effective when the given dataset is very sparse. The classification accuracy of averaging over multiple node-orders was higher in most cases than that achieved using a single node-order in our experiments. We also present experimental results for test datasets with unobserved variables, where the quality of the averaged node-order is more important. Through these experiments, we show that the difference in classification performance between the cases of multiple node-orders and single node-order is related to the level of noise, confirming the relative benefit of averaging over multiple node-orders for incomplete data. We conclude that BMA of Bayesian network classifiers over multiple node-orders has an apparent advantage when the given dataset is sparse and noisy, despite the method's heavy computational cost.

Keywords :

Markov processes; Monte Carlo methods; belief networks; pattern classification; BMA; Bayesian model averaging; Bayesian network classifier; MCMC; Markov chain Monte Carlo sampling technique; multiple node-order; sparse dataset; Bayesian methods; Computational efficiency; Computer networks; Graphical models; Noise level; Probability distribution; Random variables; Testing; Uncertainty; Bayesian model averaging (BMA); Bayesian networks; Markov chain Monte Carlo (MCMC); classification; sparse data; Algorithms; Artificial Intelligence; Bayes Theorem; Cluster Analysis; Databases, Factual; Information Storage and Retrieval; Neural Networks (Computer); Pattern Recognition, Automated

Language :

English

Journal_Title :

Systems, Man, and Cybernetics, Part B: Cybernetics, IEEE Transactions on

Publisher :

IEEE

ISSN :

1083-4419

Type :

jour

DOI :

10.1109/TSMCB.2005.850162

Filename :

1542274

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1240262