Author_Institution :
Coll. of Comput. & Inf. Eng., Hunan Univ. of Commerce, Changsha, China
Abstract :
Compared with single classifiers, ensemble learning offers significant and stable performance improvements, and ensemble pruning can further improve both the efficiency and the performance of ensembles. Both are therefore active research topics, not only in traditional machine learning but also in the more recent field of data stream mining. To provide a uniform platform and framework for developing and evaluating ensemble-based data mining techniques, LibEDM, an open-source library written in C++, is presented; it can also serve as a toolkit for solving real-world problems. LibEDM is highly modular and exposes simple interfaces, which makes it easy to use and to extend. It includes popular methods for single classifiers, ensemble learning, stream-based ensembles and ensemble pruning, along with representative functions for data preprocessing and classifier evaluation, such as cross-validation and statistical tests. With LibEDM, researchers and developers can implement, evaluate, compare and apply ensemble techniques with much less effort than before.
Keywords :
C++ language; data mining; learning (artificial intelligence); C++ programming language; LibEDM; data preprocessing; data stream mining scopes; ensemble based data mining; ensemble learning; ensemble pruning; machine learning; open-source library; stable performance improvement; statistical tests; stream-based ensemble; uniform platform; Bagging; Classification algorithms; Data mining; Libraries; Prediction algorithms; Support vector machine classification; Training; C++; data stream mining; ensemble learning; ensemble pruning; toolkit;