Title :
Towards Big Data Bayesian Network Learning - An Ensemble Learning Based Approach
Author :
Yan Tang; Yu Wang; Kendra M. L. Cooper; Ling Li
Author_Institution :
Coll. of Comput. & Inf., Hohai Univ., Nanjing, China
Date :
June 27 - July 2, 2014
Abstract :
We are entering the Big Data era [1]. The Bayesian Network (BN), a directed probabilistic graphical model, provides intuitive knowledge presentation and accurate prediction in many mission-critical areas. However, current BN learning algorithms do not scale well to Big Data. This paper proposes a novel parallel BN learning algorithm called PENBays (Parallel ENsemble based Bayesian Networks Learning), which integrates three leading BN learning algorithms: MMHC, TPDA and REC. It has three phases: Data Preprocess (DP), Individual Ensemble Learning (IEL) and Central Ensemble Learning (CNL). Through these phases, PENBays learns a BN quickly and effectively from large datasets. Experiments show that PENBays learns BNs with better accuracy than the baseline learning algorithms MMHC, TPDA and REC, indicating promising potential for Big Data mining applications.
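The three phases named in the abstract can be illustrated with a minimal sketch. The abstract does not say how the phases combine their results, so everything below is an assumption made for illustration: the data split in `split_data`, the majority vote over directed edges in `vote`, and the toy stand-in learners (`learner_a`, `learner_b`, `learner_c`), which merely take the place of MMHC, TPDA and REC and do not implement them. The partitions are processed sequentially here; a parallel version would hand each partition to its own worker.

```python
from collections import Counter
from typing import Callable, Iterable, List, Sequence, Set, Tuple

Edge = Tuple[str, str]                 # directed edge (parent, child)
Rows = Sequence[dict]                  # one dict per data record
Learner = Callable[[Rows], Set[Edge]]  # hypothetical learner interface


def split_data(rows: Rows, n_parts: int) -> List[Rows]:
    """DP phase (assumed): slice the dataset into n_parts interleaved chunks."""
    return [rows[i::n_parts] for i in range(n_parts)]


def vote(edge_sets: Iterable[Set[Edge]], threshold: float = 0.5) -> Set[Edge]:
    """Keep edges proposed by more than `threshold` of the structures (assumed rule)."""
    edge_sets = list(edge_sets)
    counts = Counter(edge for edges in edge_sets for edge in edges)
    return {e for e, c in counts.items() if c > threshold * len(edge_sets)}


def penbays_sketch(rows: Rows, learners: List[Learner], n_parts: int) -> Set[Edge]:
    local_structures = []
    for part in split_data(rows, n_parts):
        # IEL phase (assumed): combine the base learners' results per partition.
        local_structures.append(vote(learner(part) for learner in learners))
    # CNL phase (assumed): combine the per-partition structures centrally.
    return vote(local_structures)


# Toy stand-ins that ignore the data and return fixed edge sets.
def learner_a(rows: Rows) -> Set[Edge]:
    return {("A", "B"), ("B", "C")}


def learner_b(rows: Rows) -> Set[Edge]:
    return {("A", "B"), ("C", "B")}


def learner_c(rows: Rows) -> Set[Edge]:
    return {("A", "B"), ("B", "C"), ("A", "C")}


rows = [{"A": 0, "B": 1, "C": 0} for _ in range(10)]
result = penbays_sketch(rows, [learner_a, learner_b, learner_c], n_parts=2)
print(sorted(result))  # [('A', 'B'), ('B', 'C')]
```

In this sketch an edge survives only if a majority of the base learners propose it within a partition and a majority of partitions then agree on it, which is one plausible reading of a two-level ensemble rather than the published procedure.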
Keywords :
Bayes methods; Big Data; data mining; directed graphs; learning (artificial intelligence); parallel algorithms; Big Data Bayesian network learning; Big Data mining; CNL; DP; IEL; MMHC; PENBays; REC; TPDA; central ensemble learning; data preprocess; directed probabilistic graph model; ensemble learning based approach; individual ensemble learning; knowledge presentation; parallel BN learning algorithm; parallel ensemble based Bayesian network learning; Algorithm design and analysis; Computational modeling; Data models; Prediction algorithms; Bayesian network; Distributed computing; Ensemble learning
Conference_Title :
Big Data (BigData Congress), 2014 IEEE International Congress on
Conference_Location :
Anchorage, AK
Print_ISBN :
978-1-4799-5056-0
DOI :
10.1109/BigData.Congress.2014.58