Title :
A hybrid algorithm applied to classify unbalanced data
Author :
Lee, C.Y. ; Yang, M.R. ; Chang, L.Y. ; Lee, Z.J.
Author_Institution :
Dept. of Inf. Manage., Lan Yang Inst. of Technol., I Lan, Taiwan
Abstract :
Unbalanced data, in which minority classes have few samples, are present in many applications. Such data are difficult to classify with traditional methods. In this paper, a hybrid algorithm based on random over-sampling, decision tree (DT), particle swarm optimization (PSO) and feature selection is proposed to classify unbalanced data. The proposed algorithm is able to select beneficial feature subsets, automatically adjust parameter values and obtain the best classification accuracy. The zoo dataset is used to test the performance of the proposed algorithm. Simulation results show that the classification accuracy of the proposed algorithm exceeds that of other existing methods.
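The random over-sampling step named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `random_oversample`, the fixed seed and the toy samples are assumptions made for illustration, and the decision tree, PSO and feature selection stages are not shown. The sketch duplicates randomly chosen minority-class samples until every class is as large as the largest class.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen minority-class samples until every
    class has as many samples as the largest class (hypothetical helper)."""
    rng = random.Random(seed)          # fixed seed so the sketch is repeatable
    counts = Counter(labels)
    target = max(counts.values())      # size of the majority class

    out_samples = list(samples)        # originals are kept unchanged
    out_labels = list(labels)
    for cls, n in counts.items():
        # Pool of original samples belonging to this class
        pool = [s for s, y in zip(samples, labels) if y == cls]
        # Draw with replacement until the class reaches the target size
        for _ in range(target - n):
            out_samples.append(rng.choice(pool))
            out_labels.append(cls)
    return out_samples, out_labels


# Toy data (assumed for illustration, not taken from the zoo dataset)
X = [[1, 0], [1, 1], [0, 1], [0, 0], [1, 0], [0, 1]]
y = ["mammal", "mammal", "mammal", "mammal", "bird", "reptile"]

X_bal, y_bal = random_oversample(X, y)
balanced_counts = Counter(y_bal)
print(balanced_counts)
```

Because duplicates are drawn only from the original samples, over-sampling of this kind is usually applied to the training split alone, so that copies of a test sample do not leak into training.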
Keywords :
data handling; decision trees; particle swarm optimisation; sampling methods; decision tree; feature selection; feature subset; hybrid algorithm; particle swarm optimization; random over sampling; unbalanced data classification; zoo dataset; Biological system modeling; Classification algorithms; Computational modeling; Dairy products; Data models; Optimization; Support vector machines;
Conference_Title :
Networked Computing and Advanced Information Management (NCM), 2010 Sixth International Conference on
Conference_Location :
Seoul
Print_ISBN :
978-1-4244-7671-8
Electronic_ISBN :
978-89-88678-26-8