DocumentCode :
3455176
Title :
EasyEnsemble and Feature Selection for Imbalance Data Sets
Author :
Liu, Tian-Yu
Author_Institution :
Sch. of Electr., Shanghai Dianji Univ., Shanghai, China
fYear :
2009
fDate :
3-5 Aug. 2009
Firstpage :
517
Lastpage :
520
Abstract :
Many labeled data sets have an unbalanced representation among their classes. When the imbalance is large, classification accuracy on the smaller class tends to be lower. In particular, when a class is of great interest but occurs relatively rarely, such as cases of fraud or instances of disease, it is important to identify it accurately. Here we propose a novel algorithm named MIEE (mutual information based feature selection for EasyEnsemble) to address this problem and improve the generalization performance of the EasyEnsemble classifier. Experimental results on the UCI data sets show that MIEE obtains better performance than asymmetric bagging and EasyEnsemble.
Keywords :
data handling; EasyEnsemble; MIEE; feature selection; imbalance data sets; mutual information; Bagging; Bioinformatics; Biology computing; Diseases; Embryo; Intelligent systems; Machine learning; Sampling methods; Systems biology; unbalanced data sets
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Bioinformatics, Systems Biology and Intelligent Computing, 2009. IJCBS '09. International Joint Conference on
Conference_Location :
Shanghai
Print_ISBN :
978-0-7695-3739-9
Type :
conf
DOI :
10.1109/IJCBS.2009.22
Filename :
5260440